actarius
Opinionated wrappers for the mlflow tracking API.

    from actarius import NewExperimentRun
1 Features
actarius is meant to facilitate the way we log mlflow experiments in BigPanda, which means the following additions over the mlflow tracking API:
- Automatically logging stdout and stderr to file (without hiding them from the terminal/console) and logging this file as an easily readable artifact of the experiment. This supports nested experiment run contexts.
- Adding a bunch of default tags (currently focused around git).
- Convenience logging methods for dataframes as CSVs, and of arbitrary Python objects as either pickle or text files (the latter using their inherent text representation).
- Warning, but not erroring, when mlflow is badly configured or not configured at all.
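The stdout/stderr capture described in the first bullet can be illustrated with a small standard-library sketch. This is not actarius's actual implementation; the `_Tee` class and `tee_stdout` helper are hypothetical names, shown only to make the "log to file without hiding from the console" idea concrete:

```python
import io
import sys
from contextlib import contextmanager


class _Tee(io.TextIOBase):
    """Write everything it receives to several underlying streams."""

    def __init__(self, *streams):
        self.streams = streams

    def write(self, data):
        for stream in self.streams:
            stream.write(data)
        return len(data)

    def flush(self):
        for stream in self.streams:
            stream.flush()


@contextmanager
def tee_stdout(buffer):
    """Duplicate stdout into `buffer` without hiding it from the console."""
    original = sys.stdout
    sys.stdout = _Tee(original, buffer)
    try:
        yield
    finally:
        sys.stdout = original


# Anything printed inside the context reaches both the console and the buffer,
# which could then be saved as a run artifact.
buf = io.StringIO()
with tee_stdout(buf):
    print("hello experiment")
captured = buf.getvalue()
```

The same pattern extends naturally to stderr and to nested contexts, since each context restores the stream it replaced on exit.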
2 Installation
    pip install actarius
3 Use
actarius provides a custom context manager that wraps MLflow code to help you run and track experiments using BigPanda's conventions. This context manager should be provided with some basic parameters that configure which experiment is being run:
    import mlflow
    from actarius import ExperimentRunContext, log_df

    expr_databricks_path = 'Shared/experiments/pattern_generation/run_org'
    with ExperimentRunContext(expr_databricks_path):
        mlflow.set_tags({'some_tag': 45})
        mlflow.log_params({'alpha': 0.5, 'beta': 0.2})
        # run experiment code...
        mlflow.log_metrics({'auc': 0.71, 'stability': 33.43})
        log_df(my_df)
actarius also provides an experiment object that needs to be closed explicitly:
    from actarius import ExperimentRun

    expr_databricks_path = 'Shared/experiments/pattern_generation/run_org'
    exp_obj = ExperimentRun(expr_databricks_path)
    exp_obj.set_tags({'some_tag': 45})
    exp_obj.log_params({'alpha': 0.5, 'beta': 0.2})
    # run experiment code...
    exp_obj.log_df(my_df)
    exp_obj.end_run(
        tags={'another_tag': 'test'},
        params={'log_param_here': 4},
        metrics={'auc': 0.71, 'stability': 33.43},
    )
4 Configuration
actarius will not raise an error if either mlflow or the databricks CLI is badly configured or not configured at all. It will, however, issue a small warning on each experiment logging attempt (each closing of an experiment context, and each explicit call to the end_run() method of an actarius.ExperimentRun object).
Additionally, in this case experiment results will be logged to the ./mlruns/ directory (probably to the ./mlruns/0/ subdirectory), with random run ids generated and used to create per-run subdirectories.
To have the stack trace of the underlying error printed after the warning, simply set the ACTARIUS__PRINT_STACKTRACE environment variable to True. Running will then commence regularly.
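For example, the variable can be set from Python before the experiment code runs (a minimal sketch using only the standard library; the surrounding experiment script is assumed):

```python
import os

# Enable printing of the underlying stack trace after the warning.
os.environ["ACTARIUS__PRINT_STACKTRACE"] = "True"

# actarius checks this variable when an experiment logging attempt fails.
enabled = os.environ.get("ACTARIUS__PRINT_STACKTRACE") == "True"
print(enabled)
```

Setting the variable in the shell before launching the script (e.g. `export ACTARIUS__PRINT_STACKTRACE=True`) works just as well.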
5 Contributing
5.1 Installing for development
Clone:

    git clone git@github.com:bigpandaio/actarius.git

Install in development mode, including test dependencies:

    cd actarius
    pip install -e '.[test]'
5.2 Running the tests
To run the tests, use:

    cd actarius
    pytest
5.3 Adding documentation
The project is documented using the numpy docstring conventions, which were chosen as they are perhaps the most widespread conventions that are both supported by common tools such as Sphinx and result in human-readable docstrings. When documenting code you add to this project, follow these conventions.
Additionally, if you update this README.rst file, use python setup.py checkdocs to validate that it compiles.
6 Credits
Created by Shay Palachy (shay.palachy@gmail.com).