transformer-model

A Transformer model implementation in TensorFlow 2.0


Keywords
Transformer, seq2seq, SelfAttention
License
MIT
Install
pip install transformer-model==1.5.1

Documentation

TransformerModel

The Transformer model described in the "Attention Is All You Need" paper, implemented in TensorFlow 2.0.

This package has been written in a modular way so that you can feel free to use any of the classes on their own. For example, you can adapt the Encoder and Decoder classes to create other models such as BERT, Transformer-XL, GPT, GPT-2, and even XLNet if you're brave enough. All you would need to adjust is the masking class.


Requirements

  • matplotlib
  • numpy
  • pandas
  • tensorflow-datasets>=1.1.0
  • tensorflow>=2.0.0b1

How to install

pip install transformer-model

I highly recommend creating a new virtual environment for your project, as this package requires TensorFlow 2.0, which as of now is still in beta.

GPU

You can utilize your GPU by installing the GPU build of TensorFlow 2.0:

pip install tensorflow-gpu==2.0.0b1
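
To confirm that TensorFlow can actually see your device, a quick check like the following works in TensorFlow 2.0:

    import tensorflow as tf

    # List the GPUs visible to TensorFlow; an empty list means the CPU build is
    # active or the CUDA drivers are not set up correctly.
    print(tf.config.experimental.list_physical_devices('GPU'))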


How to use

I have left an example file (example.py) for you to get a feel for how this model works. The model, in its most basic form, takes a numpy array as input and returns a numpy array as output.

The most common use case for this model is language translation. In that case, you would train the model with the feature column holding the original language and the target column holding the language you want to translate to.

1. Generate your input/output.

  • I have provided a few helper functions for this, but essentially you need to generate two TensorFlow tokenizers as well as a pandas DataFrame with feature and target columns (see the sketch after this list).
  • You can utilize the helper functions in the DataProcess class to generate TensorDatasets from your DataFrame, as well as perform a train_test_split.
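
As a rough sketch of step 1, the tokenizers can be built directly with tensorflow_datasets; the toy corpus and column names below are placeholders, and the DataProcess helpers take over from the resulting DataFrame:

    import pandas as pd
    import tensorflow_datasets as tfds

    # Toy parallel corpus: 'feature' is the source language, 'target' the translation.
    df = pd.DataFrame({
        'feature': ['hello world', 'how are you'],
        'target':  ['hallo welt', 'wie geht es dir'],
    })

    # One subword tokenizer per language, built from the corpus itself.
    feature_tokenizer = tfds.features.text.SubwordTextEncoder.build_from_corpus(
        df['feature'], target_vocab_size=2**13)
    target_tokenizer = tfds.features.text.SubwordTextEncoder.build_from_corpus(
        df['target'], target_vocab_size=2**13)

From here, the DataProcess helpers can turn the DataFrame into TensorDatasets and split them into train/test sets; check example.py for the exact method names.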

2. Learning Rate / Optimizer

  • The Transformer model trains best with the custom learning-rate schedule from the paper: a sharp warm-up followed by a gradual decay.
  • This schedule has been implemented in the CustomSchedule() class. Feel free to play around with the number of warm-up steps (see the sketch after this list).
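
The schedule in the paper scales the learning rate by the model dimension, ramps it up linearly for a number of warm-up steps, and then decays it with the inverse square root of the step count. The package's CustomSchedule() captures this idea; the sketch below re-implements it in plain Keras (class and argument names here are illustrative, not the package's own) so you can see the shape of the curve:

    import tensorflow as tf

    class WarmupThenDecay(tf.keras.optimizers.schedules.LearningRateSchedule):
        """Learning rate from 'Attention Is All You Need':
        linear warm-up followed by inverse-square-root decay."""

        def __init__(self, d_model, warmup_steps=4000):
            super().__init__()
            self.d_model = tf.cast(d_model, tf.float32)
            self.warmup_steps = warmup_steps

        def __call__(self, step):
            step = tf.cast(step, tf.float32)
            decay = tf.math.rsqrt(step)                    # kicks in after warm-up
            warmup = step * (self.warmup_steps ** -1.5)    # linear ramp at the start
            return tf.math.rsqrt(self.d_model) * tf.math.minimum(decay, warmup)

    # The paper pairs this schedule with Adam and these beta/epsilon values.
    learning_rate = WarmupThenDecay(d_model=128)
    optimizer = tf.keras.optimizers.Adam(learning_rate, beta_1=0.9, beta_2=0.98,
                                         epsilon=1e-9)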

3. Define the Transformer Model

  • Create the HPARAMS that will define how large the model should be (an illustrative set is sketched below).
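
An illustrative hyperparameter set is shown below; the exact keys expected by this package's model and Trainer may differ, so treat the names as placeholders and check example.py for the real ones:

    # Hypothetical HPARAMS -- key names and values are illustrative only.
    HPARAMS = {
        'num_layers': 4,                # depth of the encoder/decoder stacks
        'd_model': 128,                 # embedding / model dimension
        'num_heads': 8,                 # attention heads (must divide d_model evenly)
        'dff': 512,                     # inner dimension of the feed-forward blocks
        'dropout_rate': 0.1,
        'input_vocab_size': 2**13 + 2,  # tokenizer vocab size + start/end tokens
        'target_vocab_size': 2**13 + 2,
    }

Larger values (the paper's base model uses 6 layers, d_model of 512, and dff of 2048) give better translations but take considerably longer to train.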

4. Train

  • Once you have defined the Trainer, you only need to call the train() method on your trainer object (a sketch follows this list).
  • This will return the training accuracy and loss.
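
A hedged sketch of what the final wiring might look like; the import path and the Trainer constructor arguments below are assumptions, so defer to example.py for the real signature:

    # Import path and constructor arguments are placeholders -- see example.py.
    from transformer_model import Trainer

    trainer = Trainer(HPARAMS, optimizer, train_dataset)
    train_accuracy, train_loss = trainer.train()  # returns training accuracy and loss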
