Multilingual POS-tagger and Dependency-parser


Keywords
NLP, Multilingual
License
GPL-2.0+
Install
pip install multicombo==0.8.4


MultiCOMBO

Multilingual POS-Tagger and Dependency-Parser with COMBO-pytorch and spaCy

Basic usage

>>> import multicombo
>>> nlp=multicombo.load()
>>> doc=nlp('Who plays "La vie en rose"?')
>>> print(multicombo.to_conllu(doc))
# text = Who plays "La vie en rose"?
1	Who	_	PRON	_	PronType=Int	2	nsubj	_	Translit=who
2	plays	_	VERB	_	Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin	0	root	_	_
3	"	_	PUNCT	_	_	5	punct	_	SpaceAfter=No
4	La	_	DET	_	Definite=Def|Gender=Fem|Number=Sing|PronType=Art	5	det	_	Translit=la
5	vie	_	NOUN	_	Gender=Fem|Number=Sing	2	obj	_	_
6	en	_	ADP	_	_	7	case	_	_
7	rose	_	NOUN	_	Number=Sing	5	nmod	_	SpaceAfter=No
8	"	_	PUNCT	_	_	5	punct	_	SpaceAfter=No
9	?	_	PUNCT	_	_	2	punct	_	SpaceAfter=No

>>> import deplacy
>>> deplacy.render(doc)
Who   PRON  <════════════╗   nsubj
plays VERB  ═══════════╗═╝═╗ ROOT
"     PUNCT <══════╗   ║   ║ punct
La    DET   <════╗ ║   ║   ║ det
vie   NOUN  ═══╗═╝═╝═╗<╝   ║ obj
en    ADP   <╗ ║     ║     ║ case
rose  NOUN  ═╝<╝     ║     ║ nmod
"     PUNCT <════════╝     ║ punct
?     PUNCT <══════════════╝ punct

>>> deplacy.serve(doc)
http://127.0.0.1:5000
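
The CoNLL-U text printed by multicombo.to_conllu(doc) above can also be read back with the standard library alone. Each token line carries the ten tab-separated columns defined by the CoNLL-U format (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC). The sketch below is only an illustration: FIELDS and parse_conllu are hypothetical helper names, not part of multicombo, and the sample is just the first two token lines of the output above. It assumes plain integer IDs, so multiword ranges and empty nodes are not handled.

```python
# Column names defined by the CoNLL-U format (ten tab-separated fields per token).
FIELDS = ("id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc")


def parse_conllu(text):
    """Parse CoNLL-U text into a list of token dicts (hypothetical helper)."""
    tokens = []
    for line in text.splitlines():
        # Skip blank lines and comment lines such as "# text = ...".
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != len(FIELDS):
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        tok = dict(zip(FIELDS, cols))
        # Assumption: IDs are plain integers (no "1-2" ranges, no "1.1" nodes).
        tok["id"] = int(tok["id"])
        tok["head"] = int(tok["head"])
        # FEATS is "_" when empty, otherwise "Key=Value" pairs joined by "|".
        if tok["feats"] == "_":
            tok["feats"] = {}
        else:
            tok["feats"] = dict(kv.split("=", 1) for kv in tok["feats"].split("|"))
        tokens.append(tok)
    return tokens


# First two token lines of the sample output, written with explicit tab escapes.
sample = "\n".join([
    '# text = Who plays "La vie en rose"?',
    "1\tWho\t_\tPRON\t_\tPronType=Int\t2\tnsubj\t_\tTranslit=who",
    "2\tplays\t_\tVERB\t_\tMood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin"
    "\t0\troot\t_\t_",
])

tokens = parse_conllu(sample)
by_id = {t["id"]: t for t in tokens}
for t in tokens:
    # HEAD 0 marks the root of the sentence.
    head = by_id[t["head"]]["form"] if t["head"] else "ROOT"
    print(t["form"], t["upos"], t["deprel"], "->", head)
```

To parse real output, pass the string returned by multicombo.to_conllu(doc) to parse_conllu instead of the hard-coded sample.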

multicombo.load(lang="xx") loads a spaCy Language pipeline with bert-base-multilingual-cased and the spacy.lang.xx.MultiLanguage tokenizer. Other language-specific tokenizers can be selected with the lang option; several languages require additional packages.

Installation for Linux

pip3 install multicombo --user

Installation for Cygwin64

First install the python37-devel python37-pip python37-cython python37-numpy python37-cffi gcc-g++ mingw64-x86_64-gcc-g++ gcc-fortran git curl make cmake libopenblas liblapack-devel libhdf5-devel libfreetype-devel libuv-devel packages, then run:

curl -L https://raw.githubusercontent.com/KoichiYasuoka/UniDic-COMBO/master/cygwin64.sh | sh
pip3.7 install multicombo

Installation for Jupyter Notebook (Google Colaboratory)

!pip install multicombo

Try the notebook for Google Colaboratory.