Roulette - more than a metric.

Roulette is a unified way to evaluate Machine Learning models. At its core, Roulette uses a Monte Carlo Simulation (MCS) to estimate the risks of deploying an ML model to the real world. The results of the MCS are aggregated using the Wasserstein Distance (WD), yielding two metrics:

  1. Distinguishability: a measure of accuracy, i.e. by how much the model is better than the data midpoint (the mean or the most common value). The value is in the range [0,1].

  2. Certainty: a measure of consistency, i.e. by how much the model's predictions are consistent across different samples of the data. The value is >1; higher is better.
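
The idea of repeated random sampling followed by a Wasserstein comparison can be illustrated with a small standalone sketch. This is not Roulette's implementation, and it does not compute Roulette's Distinguishability or Certainty values. It assumes a toy one-feature linear model, synthetic data, and a mean-value baseline, and all function names (`wasserstein_1d`, `fit_line`, `monte_carlo_errors`) are hypothetical. It only shows how error distributions collected over many random splits can be compared with a 1-D Wasserstein distance.

```python
import random
import statistics


def wasserstein_1d(u, v):
    """1-D Wasserstein-1 distance between two equal-size samples.

    For equal-size samples this is the mean absolute difference
    between the sorted values.
    """
    if len(u) != len(v):
        raise ValueError("this sketch only handles equal-size samples")
    return sum(abs(a - b) for a, b in zip(sorted(u), sorted(v))) / len(u)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def monte_carlo_errors(xs, ys, n_rounds=200, test_frac=0.3, seed=0):
    """Collect absolute test errors of the model and of a mean baseline
    over many random train/test splits (illustrative assumption)."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    n_test = int(len(idx) * test_frac)
    model_err, base_err = [], []
    for _ in range(n_rounds):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        slope, intercept = fit_line(train_x, train_y)
        midpoint = statistics.fmean(train_y)  # the "data midpoint" baseline
        for i in test:
            model_err.append(abs(ys[i] - (slope * xs[i] + intercept)))
            base_err.append(abs(ys[i] - midpoint))
    return model_err, base_err


# Synthetic data: a noisy linear relationship (assumption for the demo).
data_rng = random.Random(42)
xs = [data_rng.uniform(0, 10) for _ in range(100)]
ys = [2.0 * x + 1.0 + data_rng.gauss(0, 1) for x in xs]

model_err, base_err = monte_carlo_errors(xs, ys)
gap = wasserstein_1d(model_err, base_err)
print(f"mean model error:    {statistics.fmean(model_err):.3f}")
print(f"mean baseline error: {statistics.fmean(base_err):.3f}")
print(f"Wasserstein distance between error distributions: {gap:.3f}")
```

A larger distance here means the model's error distribution sits further from that of the midpoint baseline. How Roulette normalises such a quantity into its two metrics is not covered by this sketch.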


Roulette is hosted on PyPI; install it using pip:

pip install roulette-ml


We demonstrate the use of the regression builder; binary classification is relatively similar.

Loading data

Roulette works with a single dataframe, with all the features and the target.

from sklearn.datasets import load_boston
import pandas as pd

boston = load_boston()
data = pd.DataFrame(boston.data)
data.columns = boston.feature_names
data['PRICE'] = boston.target
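
Note that `load_boston` was removed in scikit-learn 1.2, so the import above fails on recent versions. As a hedged alternative, the same single-dataframe layout (all features plus the target in one frame) can be built from a dataset that still ships with scikit-learn. The choice of `load_diabetes` and the target column name `target` are assumptions made for illustration, not part of the original example.

```python
from sklearn.datasets import load_diabetes

# as_frame=True returns pandas objects; .frame holds the features
# and the target together in a single DataFrame.
diabetes = load_diabetes(as_frame=True)
data = diabetes.frame

print(data.shape)          # rows x (features + target)
print(list(data.columns))  # the last column is 'target'
```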

Loading Roulette

builder = RegressionBuilder(

Building model
builder.result # will return a dictionary {'discriminability': 0.8840, 'certainty': 8.245}
builder.finalize_model() # runs a model build on the entire dataset
# will create a local artifact on 'path_to_model_file/builder'