actarius
Opinionated wrappers for the mlflow tracking API.

    from actarius import NewExperimentRun
1 Features
actarius is meant to facilitate the way we log mlflow experiments in BigPanda, which means the following additions over the mlflow tracking API:
- Automatically logging stdout and stderr to file (without hiding them from the terminal/console) and logging this file as an easily readable artifact of the experiment. This supports nested experiment run contexts.
- Adding a bunch of default tags (currently focused around git).
- Convenience logging methods for dataframes as CSVs, and of arbitrary Python objects as either pickle or text files (the latter using their inherent text representation).
- Warning, but not erroring, when mlflow is badly configured or not configured at all.
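The stdout/stderr capture described in the first bullet can be illustrated with a small standard-library sketch. This is not actarius's actual implementation; the `_Tee` class and `tee_stdout` helper are hypothetical names, shown only to make the "log to file without hiding from the console" idea concrete:

```python
import io
import sys
from contextlib import contextmanager


class _Tee(io.TextIOBase):
    """Write everything it receives to several underlying streams."""

    def __init__(self, *streams):
        self.streams = streams

    def write(self, data):
        for stream in self.streams:
            stream.write(data)
        return len(data)

    def flush(self):
        for stream in self.streams:
            stream.flush()


@contextmanager
def tee_stdout(buffer):
    """Duplicate stdout into `buffer` without hiding it from the console."""
    original = sys.stdout
    sys.stdout = _Tee(original, buffer)
    try:
        yield
    finally:
        sys.stdout = original


# Anything printed inside the context reaches both the console and the buffer,
# which could then be saved as a run artifact.
buf = io.StringIO()
with tee_stdout(buf):
    print("hello experiment")
captured = buf.getvalue()
```

The same pattern extends naturally to stderr and to nested contexts, since each context restores the stream it replaced on exit.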
2 Installation
    pip install actarius
3 Use
actarius provides a custom context manager that wraps MLflow code to help you run and track experiments using BigPanda's conventions. This context manager should be provided with some basic parameters that configure which experiment is being run:
    import mlflow
    from actarius import ExperimentRunContext, log_df

    expr_databricks_path = 'Shared/experiments/pattern_generation/run_org'
    with ExperimentRunContext(expr_databricks_path):
        mlflow.set_tags({'some_tag': 45})
        mlflow.log_params({'alpha': 0.5, 'beta': 0.2})
        # run experiment code...
        mlflow.log_metrics({'auc': 0.71, 'stability': 33.43})
        log_df(my_df)
actarius also provides an experiment object that needs to be closed explicitly:
    from actarius import ExperimentRun

    expr_databricks_path = 'Shared/experiments/pattern_generation/run_org'
    exp_obj = ExperimentRun(expr_databricks_path)
    exp_obj.set_tags({'some_tag': 45})
    exp_obj.log_params({'alpha': 0.5, 'beta': 0.2})
    # run experiment code...
    exp_obj.log_df(my_df)
    exp_obj.end_run(
        tags={'another_tag': 'test'},
        params={'log_param_here': 4},
        metrics={'auc': 0.71, 'stability': 33.43},
    )
4 Configuration
actarius will not raise an error if either mlflow or the databricks CLI is badly configured or not configured at all. It will, however, issue a small warning on each experiment logging attempt (each closing of an experiment context, and each explicit call to the end_run() method of an actarius.ExperimentRun object).
Additionally, in this case experiment results will be logged to the ./mlruns/ directory (probably to the ./mlruns/0/ subdirectory), with random run ids generated and used to create per-run subdirectories.
To have the stack trace of the underlying error printed after the warning, simply set the ACTARIUS__PRINT_STACKTRACE environment variable to True. Running will then commence regularly.
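For example, the variable can be set from Python before the experiment code runs (a minimal sketch using only the standard library; the surrounding experiment script is assumed):

```python
import os

# Enable printing of the underlying stack trace after the warning.
os.environ["ACTARIUS__PRINT_STACKTRACE"] = "True"

# actarius checks this variable when an experiment logging attempt fails.
enabled = os.environ.get("ACTARIUS__PRINT_STACKTRACE") == "True"
print(enabled)
```

Setting the variable in the shell before launching the script (e.g. `export ACTARIUS__PRINT_STACKTRACE=True`) works just as well.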
5 Contributing
5.1 Installing for development
Clone:

    git clone git@github.com:bigpandaio/actarius.git

Install in development mode, including test dependencies:

    cd actarius
    pip install -e '.[test]'
5.2 Running the tests
To run the tests, use:

    cd actarius
    pytest
5.3 Adding documentation
The project is documented using the numpy docstring conventions, which were chosen as they are perhaps the most widespread conventions that are both supported by common tools such as Sphinx and result in human-readable docstrings. When documenting code you add to this project, follow these conventions.
Additionally, if you update this README.rst file, use python setup.py checkdocs to validate that it compiles.
6 Credits
Created by Shay Palachy (shay.palachy@gmail.com).