Yet Another Reinforcement Learning Library


Keywords
deep, learning, reinforcement, a3c, ddpg, sac, ppo, machine, neural, networks, deep-reinforcement-learning, openai-gym, policy-gradient, python, reinforcement-learning, reinforcement-learning-algorithms, sarsa, tensorflow
License
MIT
Install
pip install yarll==0.0.12

Documentation

Yet Another Reinforcement Learning Library (YARLL)


Update 25/03/2019: For now, the master branch won't receive major changes. Instead, algorithms are being adapted for TensorFlow 2 (and new ones may be added) on the TF2 branch.
Update 29/10/2018: New library name.
Update 25/10/2018: Added SAC implementation.

Status

The following algorithms have currently been implemented (in no particular order):

Asynchronous Advantage Actor Critic (A3C)

The code for this algorithm can be found here. Example run after training using 16 threads for a total of 5 million timesteps on the PongDeterministic-v4 environment:

Pong example run

How to run

First, install the library using pip (if OpenCV is already installed, you can first remove it from the setup.py file):

pip install yarll

Algorithms/experiments

You can run algorithms by passing the path to an experiment specification (a file in JSON format) to main.py:

python -m yarll.main <path_to_experiment_specification>

Examples of experiment specifications can be found in the experiment_specs folder.
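
For example, assuming a specification file named experiment_specs/cartpole_a3c.json exists (the file name here is purely illustrative; check the experiment_specs folder for the actual files), the invocation would look like:

python -m yarll.main experiment_specs/cartpole_a3c.json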

Statistics

Statistics can be plotted using:

python -m yarll.misc.plot_statistics <path_to_stats>

<path_to_stats> can be one of two things:

  • A JSON file generated using gym.wrappers.Monitor, in which case the episode lengths and total reward per episode are plotted (see the sketch after this list for one way to generate such a file).
  • A directory containing TensorFlow scalar summaries for different tasks, in which case all of the found scalars are plotted.
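
As a minimal sketch of how such a Monitor JSON file can be produced, assuming an older Gym version in which gym.wrappers.Monitor is still available (the environment name, output directory, and the random policy below are purely illustrative):

import gym

env = gym.make("CartPole-v1")
# Wrap the environment so episode statistics are written as JSON files
# into the "monitor_output" directory (hypothetical path).
env = gym.wrappers.Monitor(env, "monitor_output", force=True)

observation = env.reset()
done = False
while not done:
    # Act with a random policy, purely to generate some statistics.
    observation, reward, done, info = env.step(env.action_space.sample())
env.close()

The statistics JSON file written to that output directory can then be passed to plot_statistics as <path_to_stats>.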

Help about other arguments (e.g. for using smoothing) can be found by executing python -m yarll.misc.plot_statistics -h.

Alternatively, it is also possible to use TensorBoard to show statistics in the browser by passing the directory containing the scalar summaries as the --logdir argument.
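
For example, if the summaries are stored in a directory called runs/ (an illustrative path), TensorBoard can be started with:

tensorboard --logdir runs/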