# maximal

See the Official Documentation site.

Current version: 1.1

A TensorFlow-compatible Python library that provides models and layers to implement custom Transformer neural networks. Built on TensorFlow 2.

Logo generated by Stable Diffusion 2.1.
## Installation

Its installation is straightforward:

```shell
pip install maximal
```
## How to use it?

`maximal` is commonly imported as:

```python
import maximal
from maximal.layers import TransformerLayer, GPTLayer
```

and its layers can be used in a `tf.keras` model like any other Keras layer.
## Documentation

An Official Website is now available with documentation and tutorials.
## Elements

In `layers.py`:

- `SelfAttention`: `keras.Layer`, computes Scaled Dot-Product Attention.
- `MultiHeadSelfAttention`: `keras.Layer`, a concatenation of `SelfAttention` layers, resized back to the original input shape through a linear transformation.
- `PositionalEmbedding`: `keras.Layer`, implements the double Embedding layers used in the Transformer literature, for tokens and positions. Positional encoding is learned through a `tf.keras.layers.Embedding()` layer, instead of the deterministic positional encoding of the original paper.
- `TransformerLayer`: `keras.Layer`, a single Transformer Encoder block. It can be used inside any `Sequential()` model in Keras.
- `GPTLayer`: `keras.Layer`, a GPT block. Similar to `TransformerLayer` but with a causal Attention mechanism. It can be used inside any `Sequential()` model in Keras.
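To illustrate what the attention layers compute (this is a plain NumPy sketch of the Scaled Dot-Product Attention formula from "Attention Is All You Need", not `maximal`'s actual implementation, which is written as a Keras layer):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Illustrative sketch: Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = q.shape[-1]
    # Similarity scores between queries and keys, scaled by sqrt(d_k)
    scores = q @ k.swapaxes(-2, -1) / np.sqrt(d_k)
    # Numerically stable softmax over the last axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each output position is a weighted average of the value vectors
    return weights @ v

# Example: batch of 2 sequences, 4 tokens each, model dimension 8
rng = np.random.default_rng(0)
q = rng.normal(size=(2, 4, 8))
k = rng.normal(size=(2, 4, 8))
v = rng.normal(size=(2, 4, 8))
out = scaled_dot_product_attention(q, k, v)  # shape (2, 4, 8)
```

The causal variant used in `GPTLayer` additionally masks the scores so each token can only attend to earlier positions.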
In `schedules.py`:

- `OriginalTransformerSchedule`: `keras.Layer`, implements the learning rate schedule of the original Transformer paper. It is taken from this official TensorFlow tutorial.
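The schedule in the original paper follows the formula `lrate = d_model^-0.5 * min(step^-0.5, step * warmup_steps^-1.5)`: a linear warm-up followed by inverse-square-root decay. A minimal sketch of that formula (the paper's defaults `d_model=512` and `warmup_steps=4000` are assumed; this is not `maximal`'s class, which wraps the same computation for Keras optimizers):

```python
def transformer_lr(step, d_model=512, warmup_steps=4000):
    """Sketch of the original Transformer learning rate schedule:
    lrate = d_model^-0.5 * min(step^-0.5, step * warmup_steps^-1.5)
    """
    step = max(step, 1)  # avoid division by zero at step 0
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)

# The rate rises linearly during warm-up and peaks at step == warmup_steps,
# then decays proportionally to 1/sqrt(step).
print(transformer_lr(2000), transformer_lr(4000), transformer_lr(8000))
```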
## Requirements

- `h5py`
- `numpy`
- `tensorflow >= 2.0`
## Author

Ivan Bongiorni. LinkedIn
## License

2020 Ivan Bongiorni

This repository is licensed under the MIT license. See LICENCE.txt for further details.