A simple, Keras-powered multilingual NLP framework that lets you build models in 5 minutes for named entity recognition (NER), part-of-speech tagging (PoS), and text classification tasks. Includes BERT, GPT-2, and word2vec embeddings.


Keywords
nlp, machine-learning, text-classification, named-entity-recognition, seq2seq, transfer-learning, ner, bert, sequence-labeling, nlp-framework, bert-model, text-labeling, gpt-2
License
Apache-2.0
Install
pip install kashgari-tf==0.5.5

Documentation

Kashgari

Overview | Performance | Quick start | Documentation | Contributing

🎉🎉🎉 We are proud to announce that we have entirely rewritten Kashgari with tf.keras. Kashgari now comes with an easier-to-understand API and is faster! 🎉🎉🎉

Overview

Kashgari is a simple and powerful NLP transfer learning framework. It lets you build a state-of-the-art model in 5 minutes for named entity recognition (NER), part-of-speech tagging (PoS), and text classification tasks.

  • Human-friendly. Kashgari's code is straightforward, well documented and tested, which makes it very easy to understand and modify.
  • Powerful and simple. Kashgari allows you to apply state-of-the-art natural language processing (NLP) models to your text, such as named entity recognition (NER), part-of-speech tagging (PoS) and classification.
  • Built-in transfer learning. Kashgari has built-in support for pre-trained BERT and Word2vec embedding models, which makes it simple to apply transfer learning when training your model.
  • Fully scalable. Kashgari provides a simple, fast, and scalable environment for experimentation: train your models and try new approaches using different embeddings and model structures.
  • Production ready. Kashgari can export models in the SavedModel format for TensorFlow Serving, so you can deploy them directly in the cloud.
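
To make the difference between the labeling tasks (NER, PoS) and the classification task concrete, here is a minimal sketch of the data shapes each one typically works with. It assumes the common convention of pre-tokenized sentences and does not call Kashgari itself. The sample sentences, the tag names, and the helper functions `is_valid_labeling` and `is_valid_classification` are hypothetical illustrations, so check the Kashgari documentation for the exact input format it expects.

```python
from typing import List

# Labeling (NER, PoS): one tag per token. Hypothetical sample data.
ner_x = [["Alice", "visited", "Paris"]]
ner_y = [["B-PER", "O", "B-LOC"]]

# Classification: one label per sample. Hypothetical sample data.
clf_x = [["great", "framework"], ["too", "slow"]]
clf_y = ["positive", "negative"]


def is_valid_labeling(x: List[List[str]], y: List[List[str]]) -> bool:
    """Return True when every sentence has exactly one tag per token."""
    return len(x) == len(y) and all(
        len(tokens) == len(tags) for tokens, tags in zip(x, y)
    )


def is_valid_classification(x: List[List[str]], y: List[str]) -> bool:
    """Return True when there is exactly one label per sample."""
    return len(x) == len(y)
```

A quick shape check like this, run before training, catches the most common mistake in sequence labeling data: a sentence whose tag list is shorter or longer than its token list.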

Our Goal

  • Academic users: easier experimentation to test hypotheses without coding from scratch.
  • NLP beginners: learn how to build an NLP project with production-level code quality.
  • NLP developers: build a production-level classification or labeling model within minutes.

Contributors

Thanks go to these wonderful people. There are many ways to get involved: start with the contributor guidelines, then check the open issues for specific tasks.


  • Eliyar Eziz 📖 ⚠️ 💻
  • Alex Wang 💻
  • Yusup 💻
  • Adline 💻

Road Map

  • Based on TensorFlow 2.0+ [@BrikerMan]
  • Fully support generator-based training (#336, #273) [@BrikerMan]
  • Clean code and full documentation
  • Multi GPU/TPU Support [@BrikerMan]
  • Embeddings
    • Bare Embedding [@BrikerMan]
    • Word Embedding (Load trained W2V) [@BrikerMan]
    • BERT Embedding (based on bert4keras, supports BERT, RoBERTa, ALBERT...) (#316) [@BrikerMan]
    • GPT-2 Embedding
    • FeaturesEmbedding (supports numeric features as input)
    • Stacked Embedding (stacks text embedding and features embedding)
  • Classification Task
  • Labeling Task
  • Seq2Seq Task
  • Built-in Callbacks
    • Evaluate Callback
    • Save Best Callback
  • Support TensorFlow Hub (Optional)
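
The Stacked Embedding item in the road map combines a text embedding with a numeric features embedding. As a conceptual illustration only, the sketch below assumes that "stacking" means concatenating the two vectors for each token. It uses plain Python lists rather than Kashgari or TensorFlow, and the function `stack_embeddings` and the sample vectors are hypothetical, not part of the library.

```python
from typing import List

Vector = List[float]


def stack_embeddings(
    text_vectors: List[Vector], feature_vectors: List[Vector]
) -> List[Vector]:
    """Concatenate, token by token, a text vector with a numeric-feature vector."""
    if len(text_vectors) != len(feature_vectors):
        raise ValueError("both inputs must cover the same number of tokens")
    return [t + f for t, f in zip(text_vectors, feature_vectors)]


# Hypothetical 3-token sentence: 2-dim text vectors plus a 1-dim numeric feature.
text_vectors = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
feature_vectors = [[1.0], [0.0], [1.0]]

# Each token now carries a 3-dim vector: text dimensions first, feature last.
stacked = stack_embeddings(text_vectors, feature_vectors)
```

In a real model the same idea would usually be expressed with a concatenation layer over the outputs of the two embeddings, so that downstream layers see both the text signal and the numeric features at every time step.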