pybayesbandit

Bayesian bandits in Python3.

Quickstart

$ pip install pybayesbandit

Usage

$ pybayesbandit --help
usage: pybayesbandit [-h] [-p PARAMS [PARAMS ...]] [-d MAXDEPTH] [-t TRIALS]
                     [-C C] [-e EPISODES] [-hr HORIZON] [--plot] [-v]
                     {random,ucb,thompson,vi,uct,rollout,aotree} {bernoulli}
                     {total,simple}

Bayesian bandits in Python3.

positional arguments:
  {random,ucb,thompson,vi,uct,rollout,aotree}
                        learner type
  {bernoulli}           bandit type
  {total,simple}        game setting

optional arguments:
  -h, --help            show this help message and exit
  -p PARAMS [PARAMS ...], --params PARAMS [PARAMS ...]
                        bandit parameters
  -d MAXDEPTH, --maxdepth MAXDEPTH
                        maximum number of timesteps in the tree lookahead
                        (default=10)
  -t TRIALS, --trials TRIALS
                        number of trials in Monte-Carlo sampling (default=30)
  -C C                  UCT exploration constant (default=2.0)
  -e EPISODES, --episodes EPISODES
                        number of simulation episodes (default=200)
  -hr HORIZON, --horizon HORIZON
                        number of timesteps in each episode (default=100)
  --plot                plot cumulative regret
  -v, --verbose         verbose mode

Examples

$ pybayesbandit ucb bernoulli total -p 0.5 0.8 0.3 -e 100 -hr 50 -v

Running pybayesbandit ...
>> learner  = ucb
>> bandit   = bernoulli([0.5, 0.8, 0.3])
>> episodes = 100
>> horizon  = 50
Done in 0.257 sec.

Results:
>> Reward =  32.3900 ± 3.4173
>> Regret =   7.6530 ± 1.7081

$ pybayesbandit thompson bernoulli total -p 0.5 0.8 0.3 -e 100 -hr 50 -v

Running pybayesbandit ...
>> learner  = thompson
>> bandit   = bernoulli([0.5, 0.8, 0.3])
>> episodes = 100
>> horizon  = 50
Done in 0.297 sec.

Results:
>> Reward =  35.2200 ± 3.8822
>> Regret =   4.4560 ± 2.6086

$ pybayesbandit uct bernoulli total -p 0.5 0.8 0.3 -e 100 -hr 50 --trials 15 --maxdepth 5 -v

Running pybayesbandit ...
>> learner  = uct(trials=15, maxdepth=5, C=2.0)
>> bandit   = bernoulli([0.5, 0.8, 0.3])
>> episodes = 100
>> horizon  = 50
Done in 7.066 sec.

Results:
>> Reward =  36.2800 ± 5.4828
>> Regret =   3.4360 ± 4.5856

License

pybayesbandit is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

pybayesbandit is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with pybayesbandit. If not, see http://www.gnu.org/licenses/.

pybayesbandit
Release 0.0.4

Release 0.0.4

0.0.4

0.0.2

Documentation

pybayesbandit

Quickstart

Usage

Examples

License

Stats

Development practices

Releases

Contributors

pybayesbandit Release 0.0.4

Release 0.0.4 Toggle Dropdown 0.0.4 0.0.2

Documentation

pybayesbandit

Quickstart

Usage

Examples

License

Stats

Development practices

Releases

Contributors

pybayesbandit
Release 0.0.4

Release 0.0.4

0.0.4

0.0.2