the Old Chinese (och) language for the spaCy NLP library.
requires spaCy v3.
$ pip install spacy-och
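as a quick sanity check for the version requirement, the sketch below reads the installed spaCy version through the standard library and tests whether its major version is 3. the helper names major_version and spacy_is_v3 are made up for this illustration and are not part of spacy-och.

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_string: str) -> int:
    """Return the leading major component of a version string like '3.7.2'."""
    return int(version_string.split(".")[0])


def spacy_is_v3() -> bool:
    """Return True only if spaCy is installed and its major version is 3."""
    try:
        return major_version(version("spacy")) == 3
    except PackageNotFoundError:
        # spaCy is not installed in this environment.
        return False


print("spaCy v3 installed:", spacy_is_v3())
```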
this package currently doesn't include trained models and is intended for basic NLP usage only, via spacy.blank("och"). it tokenizes texts by character and supports the
>>> import spacy
>>> nlp = spacy.blank("och")
>>> from spacy_och.examples import sentences
>>> doc = nlp(sentences)
>>> doc.text
子曰：「上下无常，非為邪也。進退无恆，非離群也。君子進德脩業、欲及時也，故无咎。」
>>> [t for t in doc if t.is_stop] # all stop words
[曰, ：, 非, 也, 。, 非, 也, 。, 、, 欲, 及, 也, 故, 。]
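to illustrate what character-level tokenization with stop word flagging amounts to, here is a plain-Python sketch that does not use spaCy at all. EXAMPLE_STOP_WORDS is only the set of tokens that appear in the stop word output above, not the package's actual list, and the function names are hypothetical; the real tokenizer may also treat whitespace or other edge cases differently.

```python
from typing import List

# Illustrative subset only: the stop words visible in the example output above.
# This is NOT the actual stop word list shipped with the package.
EXAMPLE_STOP_WORDS = {"曰", "：", "非", "也", "。", "、", "欲", "及", "故"}


def tokenize_by_character(text: str) -> List[str]:
    """Split text into one token per character."""
    return list(text)


def stop_tokens(tokens: List[str]) -> List[str]:
    """Keep only the tokens found in the example stop word set, in order."""
    return [t for t in tokens if t in EXAMPLE_STOP_WORDS]


text = "君子進德脩業、欲及時也，故无咎。"
tokens = tokenize_by_character(text)
print(tokens)
print(stop_tokens(tokens))
```

with the real package, the same information comes from iterating over the doc and checking t.is_stop, as in the session above.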
more functionality is coming soon!
after cloning the repository:
$ pip install -e ".[dev]"
$ pre-commit install
build a source archive and distribution for a release:
$ rm -rf dist/*
$ python -m build
publish the release on test PyPI (useful for making sure everything worked):
$ python -m twine upload --repository testpypi dist/*
if everything looks ok, upload to the real PyPI:
$ python -m twine upload dist/*