SidekickAI

The SidekickAI experimental library.


Keywords
PyTorch, NLP, Data
License
MIT
Install
pip install SidekickAI==0.2.6

Documentation

SidekickAI

The Sidekick AI experimental codebase. This repository contains models, data handling functions, and utilities to help with rapid experimentation.

Install at https://pypi.org/project/SidekickAI/ For some reason, this import does not work (subfolders are not detected)

Data

These scripts contain tools for manipulating and preprocessing NLP data.

Tokenization

A section containing different tokenizer utilities as well as vocab files for each pretrained tokenizer.

Vocab

A class for vocabs with helper functions built in, such as tokens_to_index and index_to_tokens. Also functions to load different vocabs for different tokenizers. Under Production

Batching

A class for batching functions to make padded batches, get lengths and masks from batches, and sort / shuffle an entire unbatched dataset.

Models

These scripts contain useful modules to use to build models

Seq2Seq

An RNN Based Seq2Seq script, containing RNNEncoder, RNNDecoder, and a full Seq2Seq model with optional attention and dynamic stopping. These support SimpleRNN, GRU, and LSTM.

Attention

This contains different types of attention modules, such as content based attention and multi headed self attention.

Transformers

This contains different transformer modules, such as TransformerEncoder, TransformerDecoder, and TransformerSeq2Seq