Corpus library

pip install NlpToolkit-Corpus==1.0.11
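
After installing, it can be useful to confirm which version pip actually placed in the environment. The following is a minimal sketch using only the standard library (`importlib.metadata`, available from Python 3.8); the helper name `installed_version` is hypothetical and not part of the Corpus library.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    # Distribution name taken from the pip command above.
    found = installed_version("NlpToolkit-Corpus")
    if found is None:
        print("NlpToolkit-Corpus is not installed")
    else:
        print(f"NlpToolkit-Corpus {found} is installed")
```

If the reported version differs from the one pinned in the pip command, the package was likely installed into a different environment than the one running the script.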


For Developers

You can also see the Java, C++, or C# repositories.



To check if you have a compatible version of Python installed, use the following command:

python -V

You can find the latest version of Python here.
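
The same check can be done from inside Python, for example at the top of a setup script. This is a sketch only: the source does not say which Python versions are compatible, so the minimum of 3.8 below is an assumed placeholder, and `is_compatible` is a hypothetical helper name.

```python
import sys

# Assumption: placeholder minimum version; the README does not state the real requirement.
ASSUMED_MINIMUM = (3, 8)


def is_compatible(version_info=sys.version_info, minimum=ASSUMED_MINIMUM) -> bool:
    """Return True if the (major, minor) version is at least the given minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_compatible():
        print(f"Python {current} meets the assumed minimum")
    else:
        print(f"Python {current} is older than the assumed minimum")
```

Comparing tuples instead of version strings avoids mistakes such as treating "3.10" as older than "3.9".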


Install the latest version of Git.

Download Code

To work on the code, create a fork from the GitHub page. Then use Git to clone the code to your local machine; on Ubuntu, use the command below:

git clone <your-fork-git-link>

A directory called Corpus will be created. Alternatively, you can use the command below to explore the code:

git clone

Open project with PyCharm IDE

Steps for opening the cloned project:

  • Start the IDE
  • Select File | Open from the main menu
  • Choose the Corpus-Py file
  • Select the open as project option
  • After a couple of seconds, the dependencies will be downloaded.