chinormfilter
Filters synonym entries written in Lucene format, removing those that duplicate what Sudachi normalization already does. Mainly useful when migrating to the Sudachi analyzer.
Usage
$ chinormfilter tests/test.txt -o out.txt
The filtered result is as follows.
レナリドミド,レナリドマイド
リンゴ => 林檎
飲む,呑む
tlc => tlc,全肺気量
リンたんぱく質,リン蛋白質,リンタンパク質
↓ filter
レナリドミド,レナリドマイド
tlc => tlc,全肺気量
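The filtering idea can be sketched in Python as follows. This is a minimal illustration, not chinormfilter's actual implementation: `toy_normalize` and its lookup table are hypothetical stand-ins for Sudachi normalization (the mapped values are chosen for the example and are not real Sudachi output), and the rule "drop a line only when every term collapses to one normalized form" is an assumption about the behavior shown in the usage example.

```python
from typing import Callable, Iterable, List

# Hypothetical stand-in for Sudachi normalization. These mappings are
# illustrative assumptions, NOT actual Sudachi dictionary output.
TOY_NORMALIZATION = {
    "リンゴ": "林檎",
    "呑む": "飲む",
    "リンたんぱく質": "リン蛋白質",
    "リンタンパク質": "リン蛋白質",
}


def toy_normalize(term: str) -> str:
    """Return the toy normalized form, or the term itself if unknown."""
    return TOY_NORMALIZATION.get(term, term)


def split_terms(line: str) -> List[str]:
    """Collect every term on a synonym line, from both sides of '=>'."""
    terms: List[str] = []
    for side in line.split("=>"):
        terms.extend(t.strip() for t in side.split(",") if t.strip())
    return terms


def filter_synonyms(
    lines: Iterable[str], normalize: Callable[[str], str]
) -> List[str]:
    """Keep lines whose terms still differ after normalization."""
    kept: List[str] = []
    for line in lines:
        stripped = line.strip()
        # Blank lines and '#' comments carry no synonyms; pass them through.
        if not stripped or stripped.startswith("#"):
            kept.append(line)
            continue
        forms = {normalize(term) for term in split_terms(stripped)}
        # A single normalized form means the normalizer already covers the
        # whole line, so the entry is redundant and is dropped.
        if len(forms) > 1:
            kept.append(line)
    return kept


SAMPLE = [
    "レナリドミド,レナリドマイド",
    "リンゴ => 林檎",
    "飲む,呑む",
    "tlc => tlc,全肺気量",
    "リンたんぱく質,リン蛋白質,リンタンパク質",
]

for entry in filter_synonyms(SAMPLE, toy_normalize):
    print(entry)
```

Passing the normalizer in as a callable keeps the sketch runnable without a dictionary installed; a real normalizer built on Sudachi could be swapped in for `toy_normalize`.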
Specify system dict
$ chinormfilter tests/test.txt -s full -o out.txt
Use Custom Dict
Specify the custom dictionary through a sudachi.json settings file.
$ chinormfilter tests/test.txt -s sudachi.json -o out.txt
TODO
- custom dict test