rldb

Performances of Reinforcement Learning Agents


Keywords
database, python3, reinforcement-learning, state-of-the-art
License
MIT
Install
pip install rldb==0.0.0

Documentation


Database of RL algorithms

[Figures: Atari Space Invaders Scores; MuJoCo Walker2d Scores]

Examples

You can use rldb.find_all({}) to retrieve all existing entries in rldb.

import rldb


all_entries = rldb.find_all({})

You can also filter entries by specifying key-value pairs that the entry must match:

import rldb


dqn_entries = rldb.find_all({'algo-nickname': 'DQN'})
breakout_noop_entries = rldb.find_all({
    'env-title': 'atari-breakout',
    'env-variant': 'No-op start',
})
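
To make the filter semantics concrete, here is a minimal sketch of exact key-value matching over plain dictionaries. It does not call rldb: `matches` and `find_all_sketch` are hypothetical helpers, the two sample entries carry only fields shown elsewhere in this README, and rldb's actual matching logic may differ.

```python
# Illustrative only: a stand-in for rldb's filtering, not its implementation.
SAMPLE_ENTRIES = [
    {
        'algo-nickname': 'Human',
        'env-title': 'atari-pong',
        'env-variant': 'No-op start',
        'score': 14.6,
    },
    {
        'algo-nickname': 'A3C',
        'env-title': 'atari-space-invaders',
        'env-variant': 'No-op start',
        'score': 1034,
    },
]


def matches(entry, filter_dict):
    """Return True if entry has every key in filter_dict with an equal value."""
    return all(
        key in entry and entry[key] == value
        for key, value in filter_dict.items()
    )


def find_all_sketch(entries, filter_dict):
    """Hypothetical analogue of find_all: keep the entries that match."""
    return [entry for entry in entries if matches(entry, filter_dict)]


# An empty filter matches everything; each extra key narrows the result.
noop_entries = find_all_sketch(SAMPLE_ENTRIES, {'env-variant': 'No-op start'})
pong_noop_entries = find_all_sketch(SAMPLE_ENTRIES, {
    'env-title': 'atari-pong',
    'env-variant': 'No-op start',
})
print(len(noop_entries), len(pong_noop_entries))  # 2 1
```

Under this reading, every key in the filter must be present in the entry with an equal value, so adding keys can only shrink the result set.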

You can also use rldb.find_one(filter_dict) to find one entry that matches the key-value pairs specified in filter_dict:

import rldb
import pprint


entry = rldb.find_one({
    'env-title': 'atari-pong',
    'algo-title': 'Human',
})
pprint.pprint(entry)

Output:

{
    'algo-nickname': 'Human',
    'algo-title': 'Human',
    'env-title': 'atari-pong',
    'env-variant': 'No-op start',
    'score': 14.6,
    'source-arxiv-id': '1511.06581',
    'source-arxiv-version': 3,
    'source-authors': [   'Ziyu Wang',
                          'Tom Schaul',
                          'Matteo Hessel',
                          'Hado van Hasselt',
                          'Marc Lanctot',
                          'Nando de Freitas'],
    'source-bibtex': '@article{DBLP:journals/corr/WangFL15,\n'
                     '    author    = {Ziyu Wang and\n'
                     '                 Nando de Freitas and\n'
                     '                 Marc Lanctot},\n'
                     '    title     = {Dueling Network Architectures for Deep '
                     'Reinforcement Learning},\n'
                     '    journal   = {CoRR},\n'
                     '    volume    = {abs/1511.06581},\n'
                     '    year      = {2015},\n'
                     '    url       = {http://arxiv.org/abs/1511.06581},\n'
                     '    archivePrefix = {arXiv},\n'
                     '    eprint    = {1511.06581},\n'
                     '    timestamp = {Mon, 13 Aug 2018 16:48:17 +0200},\n'
                     '    biburl    = '
                     '{https://dblp.org/rec/bib/journals/corr/WangFL15},\n'
                     '    bibsource = {dblp computer science bibliography, '
                     'https://dblp.org}\n'
                     '}',
    'source-nickname': 'DuDQN',
    'source-title': 'Dueling Network Architectures for Deep Reinforcement '
                    'Learning'
}
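
Fields such as source-arxiv-id and source-arxiv-version can be combined into a link to the exact paper version. The sketch below assumes an entry is a plain dict like the output above. `arxiv_url` is a hypothetical helper, not part of rldb, and it relies on arXiv's usual `abs/<id>v<version>` URL pattern.

```python
def arxiv_url(entry):
    """Build an arXiv abstract URL from an entry, or None if it has no arXiv ID."""
    arxiv_id = entry.get('source-arxiv-id')
    if arxiv_id is None:
        # Entries sourced from repositories may not carry arXiv fields.
        return None
    version = entry.get('source-arxiv-version')
    suffix = 'v{}'.format(version) if version is not None else ''
    return 'https://arxiv.org/abs/{}{}'.format(arxiv_id, suffix)


# Only the two arXiv fields from the output above are needed here.
entry = {
    'source-arxiv-id': '1511.06581',
    'source-arxiv-version': 3,
}
print(arxiv_url(entry))  # https://arxiv.org/abs/1511.06581v3
```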

Entry Structure

Here is the format of every entry:

{
    # BASICS
    "source-title": "",
    "source-nickname": "",
    "source-authors": [],

    # MISC.
    "source-bibtex": "",

    # ALGORITHM
    "algo-title": "",
    "algo-nickname": "",
    "algo-source-title": "",

    # SCORE
    "env-title": "",
    "score": 0,
}
  • source-title is the full title of the source of the score: it can be the title of the paper or GitHub repository title. source-nickname is a popular nickname or acronym for that title if it exists, otherwise it is the same as source-title.
  • source-authors is a list of authors or contributors.
  • source-bibtex is a BibTeX-format citation.
  • algo-title is the full title of the algorithm used. algo-nickname is the nickname or acronym for that algorithm if it exists, otherwise it is the same as algo-title.
  • algo-source-title is the title of the source of the algorithm. It can be, and often is, different from source-title.
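
The required keys and the nickname fallback described above can be checked mechanically. The following is a minimal sketch, assuming entries are plain dicts. `REQUIRED_KEYS`, `missing_keys`, and `fill_nicknames` are hypothetical names, and rldb may validate entries differently, if at all.

```python
# Keys taken from the entry format shown above.
REQUIRED_KEYS = (
    'source-title', 'source-nickname', 'source-authors', 'source-bibtex',
    'algo-title', 'algo-nickname', 'algo-source-title',
    'env-title', 'score',
)


def missing_keys(entry):
    """List the required keys that the entry does not define, in format order."""
    return [key for key in REQUIRED_KEYS if key not in entry]


def fill_nicknames(entry):
    """Return a copy where a missing nickname falls back to the full title."""
    filled = dict(entry)
    for prefix in ('source', 'algo'):
        title_key = prefix + '-title'
        nickname_key = prefix + '-nickname'
        if title_key in filled and not filled.get(nickname_key):
            filled[nickname_key] = filled[title_key]
    return filled


# source-nickname is deliberately omitted here to show the fallback.
partial_entry = {
    'source-title': 'Noisy Networks for Exploration',
    'algo-title': 'Asynchronous Advantage Actor Critic',
    'algo-nickname': 'A3C',
    'env-title': 'atari-space-invaders',
    'score': 1034,
}
filled = fill_nicknames(partial_entry)
print(filled['source-nickname'])  # Noisy Networks for Exploration
print(missing_keys(filled))
# ['source-authors', 'source-bibtex', 'algo-source-title']
```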

For example, the Space Invaders score of the Asynchronous Advantage Actor Critic (A3C) algorithm reported in the Noisy Networks for Exploration (NoisyNet) paper is represented by the following entry:

{
    #  BASICS
    "source-title": "Noisy Networks for Exploration",
    "source-nickname": "NoisyNet",
    "source-authors": [
        "Meire Fortunato",
        "Mohammad Gheshlaghi Azar",
        "Bilal Piot",
        "Jacob Menick",
        "Ian Osband",
        "Alex Graves",
        "Vlad Mnih",
        "Remi Munos",
        "Demis Hassabis",
        "Olivier Pietquin",
        "Charles Blundell",
        "Shane Legg",
    ],

    #  ARXIV
    "source-arxiv-id": "1706.10295",
    "source-arxiv-version": 2,

    #  MISC.
    "source-bibtex": """
@article{DBLP:journals/corr/FortunatoAPMOGM17,
    author    = {Meire Fortunato and
                 Mohammad Gheshlaghi Azar and
                 Bilal Piot and
                 Jacob Menick and
                 Ian Osband and
                 Alex Graves and
                 Vlad Mnih and
                 R{\'{e}}mi Munos and
                 Demis Hassabis and
                 Olivier Pietquin and
                 Charles Blundell and
                 Shane Legg},
    title     = {Noisy Networks for Exploration},
    journal   = {CoRR},
    volume    = {abs/1706.10295},
    year      = {2017},
    url       = {http://arxiv.org/abs/1706.10295},
    archivePrefix = {arXiv},
    eprint    = {1706.10295},
    timestamp = {Mon, 13 Aug 2018 16:46:11 +0200},
    biburl    = {https://dblp.org/rec/bib/journals/corr/FortunatoAPMOGM17},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}""",

    # ALGORITHM
    "algo-title": "Asynchronous Advantage Actor Critic",
    "algo-nickname": "A3C",
    "algo-source-title": "Asynchronous Methods for Deep Reinforcement Learning",

    # HYPERPARAMETERS
    "algo-frames": 320 * 1000 * 1000,  # Number of frames

    # SCORE
    "env-title": "atari-space-invaders",
    "env-variant": "No-op start",
    "score": 1034,
    "stddev": 49,
}

Note that, as shown here, the entry can contain additional information.
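
Because fields such as env-variant or stddev are optional, code that consumes entries is safer reading them with `dict.get`. Here is a minimal sketch, assuming entries are plain dicts. `format_score` is a hypothetical helper, not part of rldb.

```python
def format_score(entry):
    """Render one entry as a short line, including optional fields when present."""
    label = '{} on {}'.format(entry['algo-nickname'], entry['env-title'])
    variant = entry.get('env-variant')
    if variant is not None:
        label += ' ({})'.format(variant)
    score = str(entry['score'])
    stddev = entry.get('stddev')
    if stddev is not None:
        score += ' +/- {}'.format(stddev)
    return '{}: {}'.format(label, score)


# Values copied from the two entries shown in this README.
a3c_entry = {
    'algo-nickname': 'A3C',
    'env-title': 'atari-space-invaders',
    'env-variant': 'No-op start',
    'score': 1034,
    'stddev': 49,
}
human_entry = {
    'algo-nickname': 'Human',
    'env-title': 'atari-pong',
    'env-variant': 'No-op start',
    'score': 14.6,
}
print(format_score(a3c_entry))
# A3C on atari-space-invaders (No-op start): 1034 +/- 49
print(format_score(human_entry))
# Human on atari-pong (No-op start): 14.6
```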

Sources

Papers

Deep Q-Networks

Policy Gradients

Exploration

Misc.

Repositories