
Python morphological analyzer and lemmatizer for Turkish

pip install zeyrek==0.1.4a0


Zeyrek: Morphological Analyzer and Lemmatizer



Zeyrek is a partial port of the Zemberek library to Python for lemmatizing and morphologically analyzing Turkish words. It is in the alpha stage, and the API will probably change.

Basic Usage

To use Zeyrek, first create an instance of the MorphAnalyzer class:

>>> import zeyrek
>>> analyzer = zeyrek.MorphAnalyzer()

Then, you can call its analyze method on words or texts to get all possible analyses:

>>> print(analyzer.analyze('benim'))
Parse(word='benim', lemma='ben', pos='Noun', morphemes=['Noun', 'A3sg', 'P1sg'], formatted='[ben:Noun] ben:Noun+A3sg+im:P1sg')
Parse(word='benim', lemma='ben', pos='Pron', morphemes=['Pron', 'A1sg', 'Gen'], formatted='[ben:Pron,Pers] ben:Pron+A1sg+im:Gen')
Parse(word='benim', lemma='ben', pos='Verb', morphemes=['Noun', 'A3sg', 'Zero', 'Verb', 'Pres', 'A1sg'], formatted='[ben:Noun] ben:Noun+A3sg|Zero→Verb+Pres+im:A1sg')
Parse(word='benim', lemma='ben', pos='Verb', morphemes=['Pron', 'A1sg', 'Zero', 'Verb', 'Pres', 'A1sg'], formatted='[ben:Pron,Pers] ben:Pron+A1sg|Zero→Verb+Pres+im:A1sg')
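
Because one word can have several analyses, it is often convenient to group them by part of speech. The following is a minimal sketch, not part of Zeyrek itself: group_by_pos is a hypothetical helper, and the Parse namedtuple is only a stand-in that mirrors the fields shown in the output above so the example runs without the library. It assumes you pass it a flat sequence of parse objects that expose a pos attribute.

```python
from collections import defaultdict, namedtuple

# Stand-in with the same fields as the Parse output shown above (an assumption
# made for illustration; with Zeyrek installed you would use its own objects).
Parse = namedtuple("Parse", ["word", "lemma", "pos", "morphemes", "formatted"])


def group_by_pos(parses):
    """Group parse-like objects by their part-of-speech tag (hypothetical helper)."""
    grouped = defaultdict(list)
    for parse in parses:
        grouped[parse.pos].append(parse)
    return dict(grouped)


# The four analyses of 'benim' copied from the example output above.
sample = [
    Parse("benim", "ben", "Noun", ["Noun", "A3sg", "P1sg"],
          "[ben:Noun] ben:Noun+A3sg+im:P1sg"),
    Parse("benim", "ben", "Pron", ["Pron", "A1sg", "Gen"],
          "[ben:Pron,Pers] ben:Pron+A1sg+im:Gen"),
    Parse("benim", "ben", "Verb", ["Noun", "A3sg", "Zero", "Verb", "Pres", "A1sg"],
          "[ben:Noun] ben:Noun+A3sg|Zero→Verb+Pres+im:A1sg"),
    Parse("benim", "ben", "Verb", ["Pron", "A1sg", "Zero", "Verb", "Pres", "A1sg"],
          "[ben:Pron,Pers] ben:Pron+A1sg|Zero→Verb+Pres+im:A1sg"),
]

by_pos = group_by_pos(sample)
for pos, items in by_pos.items():
    # Print each tag with the number of analyses that carry it.
    print(pos, len(items))
```

Grouping on pos keeps the ambiguity visible (here, two distinct Verb readings) instead of silently picking one analysis.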

If you only need the base forms of words (their lemmas), you can call lemmatize. It returns a list of tuples, each containing the word itself and a list of its possible lemmas:

>>> print(analyzer.lemmatize('benim'))
[('benim', ['ben'])]
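
Since lemmatize returns every candidate lemma, downstream code usually has to choose one. As a rough sketch, the hypothetical helper first_lemmas below takes data in the list-of-tuples shape shown above and keeps the first candidate for each word, falling back to the word itself when the list is empty. Taking the first candidate is a naive assumption made for illustration, not a disambiguation strategy provided by Zeyrek.

```python
def first_lemmas(results):
    """Map each word to its first candidate lemma (hypothetical helper).

    `results` is expected in the shape shown above:
    a list of (word, [lemma, ...]) tuples.
    """
    return {
        word: (lemmas[0] if lemmas else word)  # fall back to the surface form
        for word, lemmas in results
    }


# Sample data copied from the lemmatize output above.
results = [('benim', ['ben'])]
print(first_lemmas(results))
```

With the sample data this prints {'benim': 'ben'}.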


This package is a Python port of part of the Zemberek package by Ahmet A. Akın.

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.