BENDeep
BENDeep is a PyTorch-based deep learning solution for Bengali NLP tasks such as Bengali translation, Bengali sentiment analysis, and so on.
Installation
pip install bendeep
Dependency
- PyTorch 1.5.0+
Pretrained Model
API
Sentiment Analysis
Analyzing Sentiment
This sentiment analysis model is an RNN-based GRU model trained on the Socian sentiment dataset, reaching a loss of 0.073 after 150 epochs.
Dataset size: 4000 sentences
from bendeep import sentiment
model_path = "senti_trained.pt"
vocab_path = "vocab.txt"
text = "রোহিঙ্গা মুসলমানদের দুর্ভোগের অন্ত নেই।জলে কুমির ডাংগায় বাঘ।আজকে দুটি ঘটনা আমাকে ভীষণ ব্যতিত করেছে।নিরবে কিছুক্ষন অশ্রু বিসর্জন দিয়ে মনটাকে হাল্কা করার ব্যর্থ প্রয়াস চালিয়েছি।"
sentiment.analyze(model_path, vocab_path, text)
Training Sentiment Model
To train this model you need a CSV file with one column, review, containing the text and another column, sentiment, containing 0 or 1, where 1 means positive and 0 means negative sentiment.
Example:
,review,sentiment
0,তোমাকে খুব সুন্দর লাগছে।,1
1,আজকের আবহাওয়া খুব খারাপ।,0
| | review | sentiment |
|---|---|---|
| 0 | তোমাকে খুব সুন্দর লাগছে। | 1 |
| 1 | আজকের আবহাওয়া খুব খারাপ। | 0 |
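A file in this layout could be produced with Python's standard csv module. The sketch below is only an illustration: the helper name write_sentiment_csv is hypothetical and not part of bendeep, and the unnamed leading index column simply mirrors the example above.

```python
import csv

# Illustrative rows: (review text, label) where 1 = positive, 0 = negative.
rows = [
    ("তোমাকে খুব সুন্দর লাগছে।", 1),
    ("আজকের আবহাওয়া খুব খারাপ।", 0),
]


def write_sentiment_csv(path, rows):
    """Write rows as ',review,sentiment' CSV (hypothetical helper, not a bendeep API)."""
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        # The first header cell is left empty, matching the index column in the example.
        writer.writerow(["", "review", "sentiment"])
        for index, (review, label) in enumerate(rows):
            if label not in (0, 1):
                raise ValueError(f"row {index}: sentiment must be 0 or 1, got {label!r}")
            writer.writerow([index, review, label])


write_sentiment_csv("sentiment_data.csv", rows)
```

Writing with encoding="utf-8" keeps the Bengali text intact, and the label check catches values other than 0 and 1 before the file is written.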
from bendeep import sentiment
data_path = "sentiment_data.csv"
sentiment.train(data_path)
# You can also pass these parameters:
# sentiment.train(data_path, batch_size = 64, epochs=100, model_name="trained.pt")
After training completes successfully, the model is saved as trained.pt and the vocabulary file as vocab.txt.
Machine Translation
Translate Bengali to English
This model is a seq2seq model with attention, trained on this dataset with a loss of 0.0.
from bendeep import translation
from bendeep.translation import EncoderRNN
from bendeep.translation import AttnDecoderRNN
data_path = "data/translation/eng-ben.txt"
encoder = "models/translation/encoder.pt"
decoder = "models/translation/decoder.pt"
input_sentence = "আমার শীত করছে।"
translation.bn2en(data_path, encoder, decoder, input_sentence)
# output
# > আমার শীত করছে ।
# = i feel cold .
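To give a rough idea of what "attention" means in a seq2seq model, here is a minimal pure-Python sketch of a single attention step. It uses plain dot-product scoring as an assumption for illustration; the scoring used inside bendeep's AttnDecoderRNN may well differ, and the names softmax and attend are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """One attention step: score each encoder state, then average them by weight."""
    # Dot-product score between the decoder state and every encoder state.
    scores = [sum(d * e for d, e in zip(decoder_state, enc)) for enc in encoder_states]
    weights = softmax(scores)
    dim = len(decoder_state)
    # The context vector is the weighted sum of the encoder states.
    context = [sum(w * enc[i] for w, enc in zip(weights, encoder_states)) for i in range(dim)]
    return weights, context


# Toy vectors standing in for three encoder hidden states and one decoder state.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [1.0, 0.0]
weights, context = attend(decoder_state, encoder_states)
```

Encoder states that align better with the decoder state receive larger weights, so they contribute more to the context vector the decoder uses for its next output word.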
Training Translation Model
To train the translation model you need a dataset in .txt format with tab-separated input and target sentences.
Example:
I eat rice.	আমি ভাত খাই।
He goes to school.	সে বিদ্যালয়ে যায়।
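Before training, it can help to check that every line of the file really contains exactly one tab. The following sketch does that with the standard library only; read_pairs and the file name eng-ben-sample.txt are hypothetical and are not taken from bendeep, whose own parsing may differ.

```python
def read_pairs(path):
    """Read 'input<TAB>target' lines and return a list of (input, target) tuples."""
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            parts = line.split("\t")
            if len(parts) != 2:
                raise ValueError(
                    f"line {line_number}: expected 2 tab-separated fields, got {len(parts)}"
                )
            pairs.append((parts[0], parts[1]))
    return pairs


# Write a small sample file in the expected layout, then read it back.
sample = "I eat rice.\tআমি ভাত খাই।\nHe goes to school.\tসে বিদ্যালয়ে যায়।\n"
with open("eng-ben-sample.txt", "w", encoding="utf-8") as f:
    f.write(sample)

pairs = read_pairs("eng-ben-sample.txt")
```

A line where the tab was replaced by spaces raises a ValueError with its line number, which makes such formatting problems easy to locate.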
from bendeep import translation
from bendeep.translation import EncoderRNN
from bendeep.translation import AttnDecoderRNN
data_path = "data/translation/eng-ben.txt"
translation.training(data_path, iteration=75000)
After training completes successfully, the encoder and decoder models are saved as encoder.pt and decoder.pt. Some random evaluation results are also displayed.