mozcpy
Mozc for Python: yet another Kana-Kanji converter
INSTALLATION
$ pip install mozcpy
USAGE
import mozcpy
converter = mozcpy.Converter()
converter.convert('まほうしょうじょ')
# => '魔法少女'
converter.convert('まほうしょうじょ', n_best=10)
# => ['魔法少女', '魔法消除', ..., '魔砲少女', 'マホウ少女', ...]
converter.convert_wakati('もうなにもこわくない')
# => 'もう 何 も 怖く ない'
converter.convert_wakati('もうなにもこわくない', n_best=3)
# => ['もう 何 も 怖く ない', 'もう 何 も こわく ない', 'もう 何 も 恐く ない']
converter.wakati("もうなにもこわくない")
# => 'もう なに も こわく ない'
converter.wakati("もうなにもこわくない", n_best=10)  # duplicates are ignored
# => ['もう なに も こわく ない']
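The calls above return a single string when n_best is omitted and a list when it is given, and that list can be shorter than n_best. The following is a minimal sketch of two wrappers that smooth this over. The names `candidates` and `tokens` are hypothetical helpers, not part of mozcpy, and the sketch assumes only the `convert` and `wakati` behaviour shown in this section.

```python
from typing import List


def candidates(converter, text: str, n_best: int = 1) -> List[str]:
    """Return conversion candidates as a list, whatever n_best is."""
    if n_best == 1:
        # Without n_best, convert() is shown returning a single string.
        return [converter.convert(text)]
    # With n_best, convert() is shown returning a list that may hold
    # fewer than n_best entries, so the length is not assumed.
    return list(converter.convert(text, n_best=n_best))


def tokens(converter, text: str) -> List[str]:
    """Split the space-delimited output of wakati() into a token list."""
    return converter.wakati(text).split()
```

With mozcpy installed, pass a `mozcpy.Converter()` instance as the first argument. Callers can then iterate over the result without checking whether they received a string or a list.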
FOR DEVELOPERS
The dictionary files are stored with Git LFS, so install Git LFS before cloning the repository.
ACKNOWLEDGEMENT
This module relies on Mozc and MeCab.
- T. Kudo, T. Hanaoka, J. Mukai, Y. Tabata, H. Komatsu. 2011. Efficient dictionary and language model compression for input method editors. In Proceedings of the Workshop on Advances in Text Input Methods (WTIM 2011), pp. 19-25.
- T. Kudo, K. Yamamoto, Y. Matsumoto. 2004. Applying Conditional Random Fields to Japanese Morphological Analysis. In Proceedings of EMNLP 2004, pp. 230-237.