bendeep

BENDeep is a PyTorch-based deep learning solution for Bengali NLP tasks.


Keywords
bangla, bengali, bengali-sentiment-analysis, bengali-translation, pytorch, sentiment-analysis
License
MIT
Install
pip install bendeep==1.2

Documentation

BENDeep

BENDeep is a PyTorch-based deep learning solution for Bengali NLP tasks such as Bengali translation and Bengali sentiment analysis.

Installation

pip install bendeep

Dependency

  • PyTorch 1.5.0+

Pretrained Model

API

Sentiment Analysis

Analyzing Sentiment

This sentiment analysis model is a GRU-based recurrent neural network trained on the Socian sentiment dataset (4,000 sentences), reaching a loss of 0.073 after 150 epochs.

from bendeep import sentiment
model_path = "senti_trained.pt"
vocab_path = "vocab.txt"
text = "রোহিঙ্গা মুসলমানদের দুর্ভোগের অন্ত নেই।জলে কুমির ডাংগায় বাঘ।আজকে দুটি ঘটনা আমাকে ভীষণ ব্যতিত করেছে।নিরবে কিছুক্ষন অশ্রু বিসর্জন দিয়ে মনটাকে হাল্কা করার ব্যর্থ প্রয়াস চালিয়েছি।"

sentiment.analyze(model_path, vocab_path, text)

Training Sentiment Model

To train this model you need a CSV file with two columns: review, which holds the text, and sentiment, which holds 0 or 1, where 1 means positive and 0 means negative sentiment.

Example:

,review,sentiment
0,তোমাকে খুব সুন্দর লাগছে।,1
1,আজকের আবহাওয়া খুব খারাপ।,0

from bendeep import sentiment
data_path = "sentiment_data.csv"
sentiment.train(data_path)
# you can also pass these parameters
# sentiment.train(data_path, batch_size=64, epochs=100, model_name="trained.pt")

After training completes successfully, the model is saved as trained.pt and the vocabulary file as vocab.txt.
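
The CSV layout described in this section can be produced and sanity-checked with the standard library before calling sentiment.train. The sketch below is only an illustration: write_sentiment_csv and check_sentiment_csv are hypothetical helpers, not part of bendeep, and it assumes the unnamed first column is a plain running index, as in the example rows.

```python
import csv


def write_sentiment_csv(rows, path):
    """Write (review, sentiment) pairs with an unnamed index column first.

    Hypothetical helper, not part of bendeep.
    """
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["", "review", "sentiment"])
        for i, (review, label) in enumerate(rows):
            if label not in (0, 1):
                raise ValueError(f"row {i}: sentiment must be 0 or 1, got {label!r}")
            writer.writerow([i, review, label])


def check_sentiment_csv(path):
    """Return the number of data rows; raise ValueError if the file is malformed."""
    with open(path, encoding="utf-8", newline="") as f:
        reader = csv.DictReader(f)
        missing = {"review", "sentiment"} - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"missing columns: {sorted(missing)}")
        count = 0
        for n, row in enumerate(reader, start=1):
            # Short rows yield None for absent fields, so guard before strip().
            if not (row["review"] or "").strip():
                raise ValueError(f"data row {n}: empty review")
            if row["sentiment"] not in ("0", "1"):
                raise ValueError(f"data row {n}: sentiment must be 0 or 1")
            count += 1
        return count
```

Running check_sentiment_csv("sentiment_data.csv") first gives a quick failure on a bad label or a missing column instead of an error partway through training.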

Machine Translation

Translate Bengali to English

This model is an attention-based seq2seq model trained with this dataset, with a reported loss of 0.0.

from bendeep import translation
from bendeep.translation import EncoderRNN
from bendeep.translation import AttnDecoderRNN

data_path = "data/translation/eng-ben.txt"
encoder = "models/translation/encoder.pt"
decoder = "models/translation/decoder.pt"
input_sentence = "আমার শীত করছে।"
translation.bn2en(data_path, encoder, decoder, input_sentence)
# output
# > আমার শীত করছে ।
# = i feel cold .

Training Translation Model

To train the translation model you need a dataset in .txt format, with the input and target sentences on each line separated by a tab.

Example:

I eat rice.	আমি ভাত খাই।
He goes to school.	সে বিদ্যালয়ে যায়।

from bendeep import translation
from bendeep.translation import EncoderRNN
from bendeep.translation import AttnDecoderRNN

data_path = "data/translation/eng-ben.txt"
translation.training(data_path, iteration=75000)

After training completes successfully, the encoder and decoder models are saved as encoder.pt and decoder.pt, and some random evaluation results are displayed.
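
Because the training file must hold exactly one tab between the input and target sentence on each line, it may be worth checking the file before a long training run. The following is a minimal sketch using only the standard library. read_translation_pairs is a hypothetical helper, not a bendeep function, and whether bendeep itself tolerates blank lines or extra columns is not stated here, so the sketch simply skips the former and rejects the latter.

```python
def read_translation_pairs(path):
    """Read tab-separated sentence pairs, one pair per line.

    Hypothetical helper, not part of bendeep.
    """
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.rstrip("\n")
            if not line.strip():
                continue  # skip blank lines
            parts = line.split("\t")
            # Exactly two non-empty fields are expected: source and target.
            if len(parts) != 2 or not all(p.strip() for p in parts):
                raise ValueError(f"line {line_no}: expected 'source<TAB>target'")
            pairs.append((parts[0].strip(), parts[1].strip()))
    return pairs
```

A call such as read_translation_pairs("data/translation/eng-ben.txt") returns a list of (source, target) tuples, or raises ValueError naming the first offending line.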

References