# maximal

See the Official Documentation site.

Current version: 1.1

A TensorFlow-compatible Python library that provides models and layers to implement custom Transformer neural networks. Built on TensorFlow 2.

Logo generated by Stable Diffusion 2.1.
## Installation

Its installation is straightforward:

```shell
pip install maximal
```
## How to use it?

`maximal` is commonly imported as:

```python
import maximal
from maximal.layers import TransformerLayer, GPTLayer
```

and its layers can be used in a `tf.keras` model like any other Keras layer.
## Documentation

An Official Website is now available with documentation and tutorials.
## Elements

In `layers.py`:

- `SelfAttention`: `keras.Layer`, computes Scaled Dot-Product Attention.
- `MultiHeadSelfAttention`: `keras.Layer`, a concatenation of `SelfAttention` layers, resized back to the original input shape through a linear transformation.
- `PositionalEmbedding`: `keras.Layer`, implements the double Embedding layers used in the Transformer literature, for tokens and positions. Positional encoding is learned through a `tf.keras.layers.Embedding()` layer, instead of the deterministic positional encoding of the original paper.
- `TransformerLayer`: `keras.Layer`, a single Transformer Encoder block. It can be used inside any `Sequential()` model in Keras.
- `GPTLayer`: `keras.Layer`, a GPT block. Similar to `TransformerLayer` but with a causal Attention mechanism. It can be used inside any `Sequential()` model in Keras.
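To illustrate what the attention layers compute (this is a plain NumPy sketch of the Scaled Dot-Product Attention formula from "Attention Is All You Need", not `maximal`'s actual implementation, which is written as a Keras layer):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Illustrative sketch: Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = q.shape[-1]
    # Similarity scores between queries and keys, scaled by sqrt(d_k)
    scores = q @ k.swapaxes(-2, -1) / np.sqrt(d_k)
    # Numerically stable softmax over the last axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each output position is a weighted average of the value vectors
    return weights @ v

# Example: batch of 2 sequences, 4 tokens each, model dimension 8
rng = np.random.default_rng(0)
q = rng.normal(size=(2, 4, 8))
k = rng.normal(size=(2, 4, 8))
v = rng.normal(size=(2, 4, 8))
out = scaled_dot_product_attention(q, k, v)  # shape (2, 4, 8)
```

The causal variant used in `GPTLayer` additionally masks the scores so each token can only attend to earlier positions.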
In `schedules.py`:

- `OriginalTransformerSchedule`: `keras.Layer`, implements the learning rate schedule of the original Transformer paper. It is taken from this official TensorFlow tutorial.
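The schedule in the original paper follows the formula `lrate = d_model^-0.5 * min(step^-0.5, step * warmup_steps^-1.5)`: a linear warm-up followed by inverse-square-root decay. A minimal sketch of that formula (the paper's defaults `d_model=512` and `warmup_steps=4000` are assumed; this is not `maximal`'s class, which wraps the same computation for Keras optimizers):

```python
def transformer_lr(step, d_model=512, warmup_steps=4000):
    """Sketch of the original Transformer learning rate schedule:
    lrate = d_model^-0.5 * min(step^-0.5, step * warmup_steps^-1.5)
    """
    step = max(step, 1)  # avoid division by zero at step 0
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)

# The rate rises linearly during warm-up and peaks at step == warmup_steps,
# then decays proportionally to 1/sqrt(step).
print(transformer_lr(2000), transformer_lr(4000), transformer_lr(8000))
```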
## Requirements

- `h5py`
- `numpy`
- `tensorflow >= 2.0`
## Author

Ivan Bongiorni. LinkedIn
## License

2020 Ivan Bongiorni

This repository is licensed under the MIT license. See LICENCE.txt for further details.