
Deep neural networks for NLP classification and sequence labeling tasks, written in PyTorch.

pip install dnnnlp==1.1.4


PyTorch - Deep Neural Network - Natural Language Processing

Version 1.1 by KzXuan

Contains CNN, RNN and Transformer layers and models implemented in PyTorch for classification and sequence labeling tasks in NLP.

  • Newly designed modules.
  • Reduced usage complexity.
  • Uses a mask to identify sequence lengths.
  • Multi-GPU parallelism for grid search.
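
Using a mask as the sequence length identifier means each padded sample carries a 0/1 row marking which positions hold real tokens. The following is a minimal sketch of that idea in plain Python, not the library's own code; the helper name pad_with_mask and the padding id 0 are assumptions made for illustration.

```python
def pad_with_mask(sequences, max_len, pad_id=0):
    """Pad (or truncate) token-id sequences to max_len and build a 0/1 mask."""
    padded, mask = [], []
    for seq in sequences:
        seq = list(seq[:max_len])          # truncate sequences that are too long
        n_pad = max_len - len(seq)         # number of padding positions needed
        padded.append(seq + [pad_id] * n_pad)
        mask.append([1] * len(seq) + [0] * n_pad)
    return padded, mask


# Three toy sequences of token ids with different lengths.
sequences = [[5, 8, 2], [7], [3, 9, 4, 6, 1]]
x, mask = pad_with_mask(sequences, max_len=4)

# The true length of each sample can be recovered by summing its mask row.
lengths = [sum(row) for row in mask]
```

Because the lengths are recoverable from the mask, a separate length array does not need to be passed alongside the data.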


Python 3.5+ and PyTorch 1.2.0+


> pip install dnnnlp

API Document (In Chinese)


Name           Type   Default     Description
n_gpu          int    1           Number of GPUs (0 means no GPU acceleration).
space_turbo    bool   True        Accelerate by using more GPU memory.
rand_seed      int    100         Random seed.
data_shuffle   bool   True        Shuffle the data for training.
emb_type       str    None        Embedding mode: None, 'const' or 'variable'.
emb_dim        int    300         Embedding dimension (or feature dimension).
n_class        int    2           Number of target classes.
n_hidden       int    50          Number of hidden nodes, or output channels of the CNN.
learning_rate  float  0.01        Learning rate.
l2_reg         float  1e-6        L2 regularization coefficient.
batch_size     int    32          Number of samples per batch.
iter_times     int    30          Number of training iterations.
display_step   int    2           Number of iterations between result outputs.
drop_prob      float  0.1         Dropout ratio.
eval_metric    str    'accuracy'  Evaluation metric: 'accuracy', 'macro', 'class1', etc.
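
The hyperparameters above are read and overridden with attribute access (for example args.n_hidden = 100). As a rough illustration of that pattern, the sketch below builds a stand-in object with argparse.Namespace using the defaults from the table. The function sketch_default_args is hypothetical; the real default_args may be implemented differently.

```python
from argparse import Namespace


def sketch_default_args():
    """Hypothetical stand-in for default_args(), filled from the table above."""
    return Namespace(
        n_gpu=1, space_turbo=True, rand_seed=100, data_shuffle=True,
        emb_type=None, emb_dim=300, n_class=2, n_hidden=50,
        learning_rate=0.01, l2_reg=1e-6, batch_size=32, iter_times=30,
        display_step=2, drop_prob=0.1, eval_metric='accuracy',
    )


# Start from the defaults and change only what the experiment needs.
args = sketch_default_args()
args.n_hidden = 100
args.eval_metric = 'macro'

# List the values that now differ from the defaults.
defaults = vars(sketch_default_args())
overrides = {k: v for k, v in vars(args).items() if v != defaults[k]}
```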


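import numpy as np
# import our modules
from dnnnlp.model import RNNModel
from dnnnlp.exec import default_args, Classify

# The arrays below are zero-filled placeholders that only show the expected
# shapes; load real data in their place (the integer dtype is an assumption).
# load the embedding matrix, shape (vocab_size, 300)
vocab_size = 10000  # made-up value for illustration
emb_mat = np.zeros((vocab_size, 300))
# load the train data
train_x = np.zeros((800, 50), dtype=int)
train_y = np.zeros((800,), dtype=int)
train_mask = np.zeros((800, 50), dtype=int)
# load the test data
test_x = np.zeros((200, 50), dtype=int)
test_y = np.zeros((200,), dtype=int)
test_mask = np.zeros((200, 50), dtype=int)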

# get the default arguments
args = default_args()
# modify part of the arguments
args.space_turbo = False
args.n_hidden = 100
args.batch_size = 32
  • Classification
# initialize a model
model = RNNModel(args, emb_mat, bi_direction=False, rnn_type='GRU', use_attention=True)
# initialize a classifier
nn = Classify(model, args, train_x, train_y, train_mask, test_x, test_y, test_mask)
# do training and testing
evals = nn.train_test(device_id=0)
  • Run several times and get the average score.
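# imports for this example (module locations are assumed to match the imports above)
from dnnnlp.model import CNNModel
from dnnnlp.exec import average_several_run
# initialize a model
model = CNNModel(args, emb_mat, kernel_widths=[2, 3, 4])
# initialize a classifier
nn = Classify(model, args, train_x, train_y, train_mask)
# run the model several times
avg_evals = average_several_run(nn.train_test, args, n_times=8, n_paral=4, fold=5)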
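
Averaging over several runs reduces the effect of random initialization on the reported score. The sketch below shows only the averaging step in plain Python and is not the library's implementation; toy_train_test and average_runs are hypothetical names, and the scores are random numbers rather than real results.

```python
import random
from statistics import mean


def toy_train_test(seed):
    """Hypothetical stand-in for one train/test run; returns a dict of scores."""
    rng = random.Random(seed)
    return {'accuracy': 0.80 + 0.05 * rng.random(),
            'macro': 0.75 + 0.05 * rng.random()}


def average_runs(run_fn, n_times, base_seed=100):
    """Call run_fn with n_times different seeds and average each metric."""
    results = [run_fn(base_seed + i) for i in range(n_times)]
    return {key: mean(r[key] for r in results) for key in results[0]}


avg_scores = average_runs(toy_train_test, n_times=8)
```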
  • Parameters' grid search.
# imports for this example (module locations are assumed to match the imports above)
from dnnnlp.model import TransformerModel
from dnnnlp.exec import grid_search
# initialize a model
model = TransformerModel(args, n_layer=12, n_head=8)
# initialize a classifier
nn = Classify(model, args, train_x, train_y, train_mask, test_x, test_y, test_mask)
# set searching params
params_search = {'learning_rate': [0.1, 0.01], 'n_hidden': [50, 100]}
# run grid search
max_evals = grid_search(nn, nn.train_test, args, params_search)
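
A grid search tries every combination of the candidate values in params_search and keeps the best-scoring one. The following sketch shows how such a dictionary expands into individual settings with itertools.product. It does not reproduce how grid_search works internally; expand_grid and toy_score are hypothetical helpers.

```python
from itertools import product


def expand_grid(params_search):
    """Yield one dict per combination of the listed candidate values."""
    names = list(params_search)
    for values in product(*(params_search[name] for name in names)):
        yield dict(zip(names, values))


def toy_score(params):
    """Hypothetical scoring function standing in for a real train/test run."""
    return params['n_hidden'] / 100 - params['learning_rate']


params_search = {'learning_rate': [0.1, 0.01], 'n_hidden': [50, 100]}
combos = list(expand_grid(params_search))   # 2 x 2 = 4 settings

# Keep the setting with the highest (toy) score.
best = max(combos, key=toy_score)
```

The number of settings is the product of the list lengths, so the cost grows quickly as more hyperparameters are added to the search.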
  • Sequence labeling
from dnnnlp.model import RNNCRFModel
from dnnnlp.exec import default_args, SequenceLabeling

# load the train data
train_x = np.array((1000, 50))
train_y = np.array((1000, 50))
train_mask = np.array((1000, 50))

# initialize a model
model = RNNCRFModel(args)
# initialize a labeler
nn = SequenceLabeling(model, args, train_x, train_y, train_mask)
# do cross validation
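
Cross validation holds out each fold once for testing while training on the remaining folds. The sketch below shows only the index splitting and is independent of dnnnlp; the helper name k_fold_indices is hypothetical, and whether the library shuffles the data before splitting is not stated here.

```python
def k_fold_indices(n_samples, n_folds):
    """Split range(n_samples) into n_folds contiguous (train, test) index pairs."""
    # Distribute any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
                  for i in range(n_folds)]
    splits, start = [], 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        splits.append((train_idx, test_idx))
        start += size
    return splits


# 1000 samples, as in the sequence labeling data above, split into 10 folds.
splits = k_fold_indices(n_samples=1000, n_folds=10)
```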


version 1.1

  • Add CRFLayer: packaging CRF for both training and testing.
  • Add RNNCRFModel: an integrated RNN-CRF sequence labeling model.
  • Add SequenceLabeling: a sequence labeling execution module that inherits from Classify.
  • Fix errors in judging whether a tensor is None.

version 1.0

  • Rename project dnn to dnnnlp.
  • Remove file base, add file utils.
  • Optimize and rename SoftmaxLayer and SoftAttentionLayer.
  • Rewrite and rename EmbeddingLayer, CNNLayer and RNNLayer.
  • Rewrite MultiheadAttentionLayer: a packaging attention layer based on nn.MultiheadAttention.
  • Rewrite TransformerLayer: support new MultiheadAttentionLayer.
  • Optimize and rename CNNModel, RNNModel and TransformerModel.
  • Optimize and rename Classify: a highly applicable classification execution module.
  • Rewrite average_several_run and grid_search: support multi-GPU parallel.
  • Support pytorch 1.2.0.

version 0.12

  • Update RNN_layer: full support for tanh, LSTM and GRU.
  • Fix errors in some mask operations.
  • Support pytorch 1.1.0.

Old version 0.12.3.

version 0.11

  • Provide an acceleration method that uses more GPU memory.
  • Fix the problem of memory consumption caused by abnormal data reading.
  • Add multi_head_attention_layer: packaging multi-head attention for Transformer.
  • Add Transformer_layer and Transformer_model: packaging a Transformer layer and model written by ourselves.
  • Support data shuffling for training.

version 0.10

  • Split the code into four files: base, layer, model, exec.
  • Add CNN_layer and CNN_model: packaging CNN layer and model.
  • Support multi-GPU parallel for each model.

version 0.9

  • Fix the problem of output format.
  • Fix the statistical errors in cross-validation part of LSTM_classify.
  • Rename: LSTM_model to RNN_layer, self_attention to self_attention_layer.
  • Add softmax_layer: a packaging fully-connected layer.

version 0.8

  • Adjust the applicability of functions in LSTM_classify to avoid rewriting in LSTM_sequence.
  • Optimize the way of parameter transfer.
  • A more complete evaluation mechanism.

version 0.7

  • Add LSTM_sequence: a sequence labeling module for LSTM_model.
  • Fix the nan-value problem in hierarchical classification.
  • Support pytorch 1.0.0.

version 0.6

  • Update LSTM_classify: support hierarchical classification.
  • The GRU_model is merged into the LSTM_model.
  • Adapt to CPU operation.

version 0.5

  • Split the running part of LSTM_classify to reduce the rewrite of custom models.
  • Add control for visual output.
  • Create function average_several_run: supports getting the average score over several training and testing runs.
  • Create function grid_search: support parameters' grid search.

version 0.4

  • Add GRU_model: a packaging GRU model based on nn.GRU.
  • Support L2 regularization.

version 0.3

  • Add self_attention: provides attention mechanism support.
  • Update LSTM_classify: adapts to complex custom models.

version 0.2

  • Support mode selection of embedding.
  • Default usage of nn.Dropout.
  • Create function default_args to provide default hyperparameters.

version 0.1

  • Initialization of project dnn: based on pytorch 0.4.1.
  • Add LSTM_model: a packaging LSTM model based on nn.LSTM.
  • Add LSTM_classify: a classification module for the LSTM model, which supports train-test and cross-validation.