pysastra
Release 0.1.0

Lightweight natural language processing for the Indonesian language.

License: MIT
Install: pip install pysastra==0.1.0

Documentation

pySastra

Design Plan

Status	Component	Description
🟠	Language	A text-processing pipeline.
🟡	Tokenizer	Segment text, and create Doc objects with the discovered segment boundaries.
🟠	Lemmatizer	Determine the base forms of words.
🟡	Morphology	Assign linguistic features such as lemmas, noun case, and verb tense, based on the word and its part-of-speech tag.
🟠	Tagger	Annotate part-of-speech tags on Doc objects.
🔄	DependencyParser	Annotate syntactic dependencies on Doc objects.
🔄	EntityRecognizer	Annotate named entities, e.g. persons or products, on Doc objects.
🔄	TextCategorizer	Assign categories or labels to Doc objects.
🔄	Matcher	Match sequences of tokens, based on pattern rules, similar to regular expressions.
🔄	PhraseMatcher	Match sequences of tokens based on phrases.
🔄	EntityRuler	Add entity spans to the Doc using token-based rules or exact phrase matches.
🔄	Sentencizer	Implement custom sentence boundary detection logic that doesn’t require the dependency parse.

Legend: 🟢 Completed with tests 🟡 Completed 🟠 In progress 🔄 Planned

Reference: spaCy language pipeline
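
To make the Tokenizer and Sentencizer rows above more concrete, here is a minimal sketch of the general idea in plain Python. It is not pySastra's API: the names `Token`, `Doc`, `tokenize`, and `split_sentences` are hypothetical, and the regular expression is an assumed rule that keeps hyphenated reduplications such as "anak-anak" as one token.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    text: str
    start: int  # character offset of the token's first character
    end: int    # character offset one past the token's last character


@dataclass
class Doc:
    text: str
    tokens: List[Token]


# Assumed rule: a word, optionally hyphenated (e.g. "anak-anak"),
# or any single character that is neither a word character nor whitespace.
_TOKEN_RE = re.compile(r"\w+(?:-\w+)*|[^\w\s]")


def tokenize(text: str) -> Doc:
    """Segment text and record each segment's boundaries in a Doc."""
    tokens = [Token(m.group(), m.start(), m.end()) for m in _TOKEN_RE.finditer(text)]
    return Doc(text, tokens)


def split_sentences(doc: Doc) -> List[List[Token]]:
    """Rule-based sentence splitting that needs no dependency parse."""
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for token in doc.tokens:
        current.append(token)
        if token.text in {".", "!", "?"}:
            sentences.append(current)
            current = []
    if current:  # trailing tokens without closing punctuation
        sentences.append(current)
    return sentences


doc = tokenize("Anak-anak bermain di taman. Mereka senang!")
print([t.text for t in doc.tokens])
print([[t.text for t in s] for s in split_sentences(doc)])
```

Storing character offsets on each token means the original text can always be recovered by slicing `doc.text`, which is the property that lets later components (tagger, matcher) annotate tokens without copying the text.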

Dependencies: 1
Dependent packages: 0
Dependent repositories: 0
Total releases: 1
Latest release: Sep 30, 2020
First release: Sep 30, 2020
Stars: 0
Forks: 0
Watchers: 0
Contributors: 1
Repository size: 1.95 KB
SourceRank: 6

Releases

0.1.0: Sep 30, 2020

Last synced: 2021-02-20 12:34:50 UTC