bordr

A fast and accurate POS and morphological tagging toolkit, lightly adapted to the Tibetan language.


Keywords
part-of-speech-tagger, java, nlp, pos-tagging, pos-tagger, python3
License
GPL-3.0
Install
pip install bordr==0.1.4

Documentation

bordr

A pip installable version of RDRPOSTagger with Tibetan-specific changes.

Maintenance

Build the source distribution:

rm -rf dist/
python3 setup.py clean sdist

Then upload it with twine (version >= 1.11.0):

twine upload dist/*

Latest change

The SDICT content passed in when generating the INIT file has changed. Each word in SDICT is now given the U tag (Unique, the single-token tag of the BILOU tagging scheme), because botok segments those words as single tokens. With this SDICT content, the generated INIT file is based on botok's segmentation, so the rules learned from it can resolve botok segmentation ambiguities.
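As a rough illustration of the change described above, the sketch below assigns the U tag to every word in a word list. It is not bordr's actual code: the function names `build_sdict` and `format_sdict`, the sample words, and the one-entry-per-line `word tag` layout are all assumptions made for this example.

```python
def build_sdict(words, tag="U"):
    """Map each non-empty word to a single tag (hypothetical helper).

    Surrounding whitespace is stripped and repeated words collapse
    into one entry, so every word ends up with exactly one tag.
    """
    sdict = {}
    for word in words:
        word = word.strip()
        if word:
            sdict[word] = tag
    return sdict


def format_sdict(sdict):
    """Render the mapping as 'word tag' lines (assumed layout)."""
    return "\n".join(f"{word} {tag}" for word, tag in sdict.items())


if __name__ == "__main__":
    # Sample words standing in for tokens that botok emits as single units.
    words = ["བཀྲ་ཤིས་", "བདེ་ལེགས་", "བཀྲ་ཤིས་"]
    print(format_sdict(build_sdict(words)))
```

Because every entry carries the same tag, an initial tagging built from such a dictionary would treat each listed word as a single-token unit, which is the behaviour the paragraph above attributes to the new SDICT content.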