Playground for testing theories of semantic development in recurrent neural networks.


Keywords
rnn, tensorflow, language modeling, neural-network, nlp-machine-learning
License
MIT
Install
pip install rnnlab==6.0.0

Documentation

Rnnlab

A Python API for training and comparing RNN language models using TensorFlow.

Installation:

Install rnnlab via pip:

pip install rnnlab

This installs the required Python modules, including numpy, tensorflow, pandas, sklearn, matplotlib, flask, spacy, and seaborn.

Environment Variables:

First, tell rnnlab where to save its log and training data by creating environment variables in your .bashrc file:

export RUNS_DIR='<path to where you want to save model data>'
export BACKUP_DIR_DIR='<path to where you want to backup model data>'
export RNNLAB_DIR='<path to where you want to save the log files>'
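
Before training, it can help to confirm that all three variables are visible to Python. The following is a minimal sketch and not part of rnnlab: the helper name `missing_rnnlab_vars` is hypothetical, and only the variable names are taken from the export lines above.

```python
import os

# Variable names copied from the export lines above.
REQUIRED_VARS = ("RUNS_DIR", "BACKUP_DIR_DIR", "RNNLAB_DIR")


def missing_rnnlab_vars(environ=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; rnnlab itself may check differently.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_rnnlab_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All rnnlab environment variables are set.")
```

Remember to open a new terminal (or run `source ~/.bashrc`) after editing the file, otherwise the variables will not be visible to the current shell.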

Configs:

Before training can begin, run the following command to create a configs file in RNNLAB_DIR:

python train.py

A variety of training hyperparameters and other settings may be specified in this file. Not all are required. Each row represents one configuration, and multiple unique configurations may be listed. A bare-bones example that trains an SRN (simple recurrent network) on the supplied CHILDES (MacWhinney, 1984) data, split into 256 docs and iterating 20 times over each:

learning_rate  bptt_steps  corpus_name       num_types  num_parts  num_iterations
0.01           7           childes-20171213  4096       256        20
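
To make the row-per-configuration layout concrete, here is a minimal sketch that reads the example above into one dictionary per configuration. It assumes the configs file is whitespace-delimited with a single header row, which may not match the file rnnlab actually writes; `read_configs` and `_convert` are hypothetical names, not rnnlab functions.

```python
import io

# The example configuration from above, inlined so the sketch is self-contained.
EXAMPLE_CONFIGS = """\
learning_rate bptt_steps corpus_name num_types num_parts num_iterations
0.01 7 childes-20171213 4096 256 20
"""


def _convert(value):
    """Convert a field to int or float when possible, else keep the string."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    return value


def read_configs(handle):
    """Parse a header row plus one configuration per row into dicts.

    Assumes whitespace-delimited columns (an assumption, not rnnlab's spec).
    """
    rows = [line.split() for line in handle if line.strip()]
    if not rows:
        return []
    header, *records = rows
    configs = []
    for record in records:
        if len(record) != len(header):
            raise ValueError(
                "row has %d fields, expected %d" % (len(record), len(header))
            )
        configs.append(
            {name: _convert(value) for name, value in zip(header, record)}
        )
    return configs


if __name__ == "__main__":
    for config in read_configs(io.StringIO(EXAMPLE_CONFIGS)):
        print(config)
```

Adding a second row to the file would yield a second dictionary, that is, a second configuration to train.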

Training:

rnnlab comes packaged with the corpus childes-20171213 so that you can test your implementation right out of the box. Just make sure you specify the name of the corpus in your configs file.

To train one model per configuration specified in your configs file, run the training script:

python train.py

To train 3 replicas per configuration, add the -r3 argument:

python train.py -r3

If rnnlab's log already contains 3 replicas of a particular configuration, training of that configuration will be skipped. To train it anyway, either increase the number of replicas or add the --noskip argument, which turns off skipping:

python train.py -r3 --noskip
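
The skipping rule described above can be sketched as follows. This is an illustration under stated assumptions, not rnnlab's implementation: the parser, the function `configs_to_train`, the default of one replica, and the representation of the log as a list of configuration identifiers are all hypothetical. Only the `-r` and `--noskip` flags come from the commands above.

```python
import argparse
from collections import Counter


def build_parser():
    """Hypothetical parser mirroring the -r and --noskip flags shown above."""
    parser = argparse.ArgumentParser()
    # Default of 1 is an assumption (one model per configuration).
    parser.add_argument("-r", type=int, default=1, dest="replicas")
    parser.add_argument("--noskip", action="store_true")
    return parser


def configs_to_train(config_ids, logged_ids, replicas, noskip=False):
    """Return the configurations that should not be skipped.

    A configuration is skipped only when skipping is enabled and the log
    already holds at least `replicas` entries for it.
    """
    counts = Counter(logged_ids)
    return [c for c in config_ids if noskip or counts[c] < replicas]


if __name__ == "__main__":
    args = build_parser().parse_args(["-r3"])
    # Hypothetical log: configuration "a" already has 3 replicas, "b" has 1.
    log = ["a", "a", "a", "b"]
    print(configs_to_train(["a", "b"], log, args.replicas, args.noskip))
```

With this hypothetical log, only configuration "b" is selected; passing `--noskip` would select both.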

Analysis using the browser app

During training, hidden state activations for user-specified words (probes) are collected and saved to disk. An included browser application can visualize this data during and after training. Once you have started training a model, a bash alias is created for easy access to the browser app. In a new bash terminal, type:

python app.py
* Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)

Here you can plot various evaluation metrics across training time, SVD and t-SNE of the learned representations of probe words, model comparisons, and much more.

Important Note

rnnlab is still in the early stages of development. The package is aimed primarily at enabling replication studies.

Project Information

rnnlab is released under the MIT license. The code is on GitHub, and the latest release is on PyPI. Tested on Python 3.5.