Random variables


Keywords: machine-learning, probability, statistics, bayesian
License: MIT

rv

Random variables (RV) for Rust.


rv offers basic functionality for many probability distributions. For example, here is a conjugate analysis of Bernoulli trials:

use rv::prelude::*;

// Prior over the unknown coin weight. Assume all weights are equally
// likely.
let prior = Beta::uniform();

// observations generated by a fair coin
let obs_fair: Vec<u8> = vec![0, 1, 0, 1, 1, 0, 1];

// observations generated by a coin rigged to always show heads. Note that
// we're using `bool`s here. Bernoulli supports multiple types.
let obs_fixed: Vec<bool> = vec![true; 6];

let data_fair: BernoulliData<_> = DataOrSuffStat::Data(&obs_fair);
let data_fixed: BernoulliData<_> = DataOrSuffStat::Data(&obs_fixed);

// Let's compute the posterior predictive probability (pp) of a heads given
// the observations from each coin.
let postpred_fair = prior.pp(&1u8, &data_fair);
let postpred_fixed = prior.pp(&true, &data_fixed);

// The probability of heads should be greater under the all heads data
assert!(postpred_fixed > postpred_fair);

// We can also get the posteriors
let post_fair: Beta = prior.posterior(&data_fair);
let post_fixed: Beta = prior.posterior(&data_fixed);

// And compare their means
let post_mean_fair: f64 = post_fair.mean().unwrap();
let post_mean_fixed: f64 = post_fixed.mean().unwrap();

assert!(post_mean_fixed > post_mean_fair);
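
To make explicit what the posterior and posterior predictive calls above compute, here is a hand-rolled sketch of the Beta-Bernoulli conjugate update in plain Rust. It does not use rv: `BetaParams`, `beta_posterior`, `beta_mean`, and `predictive_heads` are hypothetical names introduced only for illustration, and the uniform prior is assumed to be Beta(1, 1).

```rust
/// Parameters of a Beta(alpha, beta) distribution.
/// Hypothetical helper type for illustration; this is not rv's `Beta`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct BetaParams {
    alpha: f64,
    beta: f64,
}

/// Conjugate update for Bernoulli observations: each head (`true`) adds one
/// to `alpha`, each tail (`false`) adds one to `beta`.
fn beta_posterior(prior: BetaParams, obs: &[bool]) -> BetaParams {
    let heads = obs.iter().filter(|&&x| x).count() as f64;
    let tails = obs.len() as f64 - heads;
    BetaParams {
        alpha: prior.alpha + heads,
        beta: prior.beta + tails,
    }
}

/// Mean of a Beta(alpha, beta) distribution: alpha / (alpha + beta).
fn beta_mean(p: BetaParams) -> f64 {
    p.alpha / (p.alpha + p.beta)
}

/// Posterior predictive probability of heads. For the Beta-Bernoulli pair
/// this equals the posterior mean.
fn predictive_heads(prior: BetaParams, obs: &[bool]) -> f64 {
    beta_mean(beta_posterior(prior, obs))
}
```

Under these assumptions, the fair-coin data (4 heads, 3 tails) yields Beta(5, 4) with mean 5/9, and the six-heads data yields Beta(7, 1) with mean 7/8, which is consistent with the assertions in the example above.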

Feature flags

  • serde1: Enables serialization and deserialization of structs via serde.
  • process: Enables Gaussian processes.
  • arraydist: Enables distributions and statistical tests that require the nalgebra crate.
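
As a sketch of how these flags might be switched on, a Cargo.toml dependency entry could look like the fragment below. The `"*"` version requirement is only a placeholder, not a recommendation; pin it to the release you actually use. The fragment also assumes the flags are not already enabled by default.

```toml
[dependencies]
# Placeholder version; replace "*" with the rv release you depend on.
rv = { version = "*", features = ["serde1", "process", "arraydist"] }
```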

Design

Random variables are designed to be flexible. For example, we don't just want a Beta distribution that works with f64; we want it to work with a range of types:

use rv::prelude::*;

// Beta(0.5, 0.5)
let beta = Beta::jeffreys();

let mut rng = rand::thread_rng();

// 100 f64 weights in (0, 1)
let f64s: Vec<f64> = beta.sample(100, &mut rng);
let ln_pdf_x = beta.ln_pdf(&f64s[42]);

// 100 f32 weights in (0, 1)
let f32s: Vec<f32> = beta.sample(100, &mut rng);
let ln_pdf_y = beta.ln_pdf(&f32s[42]);

// 100 Bernoulli distributions -- Beta is a prior on the weight
let berns: Vec<Bernoulli> = beta.sample(100, &mut rng);
let ln_pdf_bern = beta.ln_pdf(&berns[42]);

For more interesting examples, including use in machine learning, see examples/.

Contributing

Bjork has had a great influence on how I create things. She once said in an interview:

When I did "Debut" I thought, 'OK, I've pleased enough people, I'm gonna get really selfish.' And I never sold as many records as with "Debut". So, I don't know, it seems the more selfish I am, the more generous I am. I'm not going to pretend I know the formula. I can only please myself.

And so our goal with rv is to please ourselves. We use it in our tools and we've designed it to do what we want. We are happy if you find rv useful, and we will entertain ideas for changes -- and accept them if we like them -- but in the end, rv is for us.

If you'd like to offer a contribution:

  1. Please create an issue before starting any work. We're far from stable, so we might actually be working on what you want, or we might be working on something that will change the way you might implement it.
  2. If you plan on implementing a new distribution, implement at least Rv, Support, and either ContinuousDistr or DiscreteDistr. Of course, more is better!
  3. Implement new distributions for the appropriate types. For example, don't just implement Rv<f64>; also implement Rv<f32>. Check out other distributions to see how this can be done easily with macros.
  4. Write tests, docs, and doc tests.
  5. Use rustfmt. We've included a .rustfmt.toml in the project directory.
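
As a rough illustration of item 3, the sketch below uses a macro to implement one trait for both f64 and f32. It is self-contained and does not use rv: the `LnDensity` trait, the toy `Exponential` struct, and the `impl_ln_density!` macro are all hypothetical stand-ins, and rv's real traits and macros may look quite different.

```rust
/// Hypothetical stand-in for a density trait, generic over the value type X.
/// This is not rv's `Rv` trait.
trait LnDensity<X> {
    fn ln_density(&self, x: &X) -> f64;
}

/// Toy exponential distribution with rate > 0, used only for illustration.
struct Exponential {
    rate: f64,
}

/// Implements `LnDensity<T>` for `Exponential` for each listed float type,
/// so the density logic is written once.
macro_rules! impl_ln_density {
    ($($kind:ty),+) => {
        $(
            impl LnDensity<$kind> for Exponential {
                fn ln_density(&self, x: &$kind) -> f64 {
                    // Widen to f64 so both impls share the same arithmetic.
                    let x = f64::from(*x);
                    if x < 0.0 {
                        // Outside the support: the density is zero.
                        f64::NEG_INFINITY
                    } else {
                        self.rate.ln() - self.rate * x
                    }
                }
            }
        )+
    };
}

impl_ln_density!(f64, f32);
```

With both implementations in place, the argument type selects the implementation: `expo.ln_density(&0.5_f64)` and `expo.ln_density(&0.5_f32)` both compile against the same `Exponential` value.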