Chulalongkorn University Natural Language Processing Library (Beta)


Keywords
nlp, thai, machine-learning, tokenization, wordembeddings
License
GPL-3.0
Install
pip install cunlp==0.3.2

Documentation

CUnlp v0.3.2 (beta)

What is CUnlp?

      A Python library for NLP tasks in the Thai language, using a machine-learning approach built on TensorFlow and Keras.

Features List

Model

  • Word tokenization
  • POS tagging
  • Sentiment analysis (soon)
  • Topic analysis (soon)
  • Latent analysis (soon)
  • Review analysis (soon)

Embedding

  • Word2Vector
  • Compare word similarity
  • K-nearest word similarity
  • Word substitution
  • Initialize embeddings (see the sketch after this list)

and more...
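
The per-word vectors can also be used to seed a Keras Embedding layer. The sketch below is only an illustration under assumptions: it uses a made-up vocabulary and assumes cu.embedding.vectorize returns a fixed-length 1-D vector per word; the library's own embedding-initialization helper is not shown here.

import numpy as np
import cunlp as cu
from keras.layers import Embedding

# Example vocabulary (illustrative only)
vocab = ["หมา", "แมว", "นก"]

# Assumed: vectorize returns a fixed-length 1-D vector per word
weights = np.asarray([cu.embedding.vectorize(w) for w in vocab])

# Seed a Keras Embedding layer with the pre-trained vectors
embedding_layer = Embedding(
    input_dim=weights.shape[0],
    output_dim=weights.shape[1],
    weights=[weights],
)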

Requirement

       Currently the library only supports Python 3, with TensorFlow v1.4.0rc0+ and Keras v2.1.5 installed.
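
To verify that a compatible environment is present, the installed versions can be checked from Python (a quick sanity check, not part of the library itself):

import tensorflow as tf
import keras

print(tf.__version__)     # should be 1.4.0rc0 or newer
print(keras.__version__)  # should be 2.1.5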

Installation

Install directly from PyPI:

$ pip install cunlp

Usage

import cunlp as cu

# Word tokenization
sentence = "สวัสดีชาวโลกเรามาช่วยตัดคำให้"
tokens_in_list = cu.model.tokenize(sentence)
tokens_in_string = cu.model.tokenize(sentence, listing=False)

# POS tagging
sentence = "ฉันชอบกินอาหารจีน"
tokens_of_sentence = cu.model.tokenize(sentence)
pos_of_words = cu.model.pos(tokens_of_sentence)

# Word embedding
word_a = "หมา"
word_b = "แมว"
word_c = "เสื้อฮาวาย"
vector_of_word_a = cu.embedding.vectorize(word_a)
vector_and_word_of_word_a = cu.embedding.vectorize_in_depth(word_a)
substituted_word_c = cu.embedding.substitute(word_c)

similarity_score = cu.embedding.compare_similarity(word_a, word_b)
similarity_score_with_substitution = cu.embedding.compare_similarity(word_a, word_b, sub=True)

top_three_similar = cu.embedding.most_similarity(word_c, rank=3)
top_three_similar_with_substitution = cu.embedding.most_similarity(word_c, rank=3, sub=True)
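
Putting the two models together, and assuming tokenize returns a list of tokens and pos returns one tag per token (as the example above suggests), word/tag pairs can be printed like this:

import cunlp as cu

sentence = "ฉันชอบกินอาหารจีน"
tokens = cu.model.tokenize(sentence)   # assumed: list of Thai tokens
tags = cu.model.pos(tokens)            # assumed: one POS tag per token

for token, tag in zip(tokens, tags):
    print(token, tag)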

API

https://cunlp-api.herokuapp.com/tokenize?sentence=ทดสอบการตัดคำอย่างง่าย
https://cunlp-api.herokuapp.com/pos?sentence=ฉันชอบกินอาหารจีนมาก
https://cunlp-api.herokuapp.com/vectorize?word=แมว
https://cunlp-api.herokuapp.com/compare_similarity?word1=แมว&word2=หมา

*For testing only!
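
A minimal sketch of calling the test API from Python with the requests package; the response format is not documented here, so the raw body is printed as-is:

import requests

resp = requests.get(
    "https://cunlp-api.herokuapp.com/tokenize",
    params={"sentence": "ทดสอบการตัดคำอย่างง่าย"},
)
resp.raise_for_status()
print(resp.text)  # inspect the raw response body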

Benchmark

Task              | Precision | Recall  | F1-score | Detail
Word tokenization | 0.97072   | 0.97052 | 0.97062  | on BEST2010
Word embedding    | -         | -       | -        | view
POS tagging       | 0.81327   | 0.75963 | 0.78554  | view

Contributors

  • Danupat Khamnuansin (jrkns)
  • Nuttasit Mahakusolsirikul (nattasit-m)