Manage neural experiments, based on your commit history


Keywords: coldbrew, experiments, git, git-workflows, neural-experiments, preparation, snapshot
License: Other
Install: pip install coldbrew==3.5

Documentation

冷萃 coldbrew ☕️🥶

Prepares and manages neural experiments without getting in the way

Goals

  1. Provide a simple way of transferring an immutable snapshot of the current code to a server
  2. Give an overview of active experiments with a simple web interface
  3. Automatically prepare experiment directories and render template strings which can then be directly pasted into a terminal to start the experiment
  4. Improve reproducibility: Store a code snapshot for every experiment which can be viewed right in the web interface

coldbrew does so without getting in the way: everything is implemented using standard git workflows, and existing code does not need to change. There is also no database or other system that needs to be maintained in the future.

coldbrew does not run any training directly; it just snapshots the code, transfers it to a server via git, prepares code and results directories, and then returns command strings (based on previously defined templates) which can be pasted into a terminal to start the training.

How it works

[Figure: flow]

coldbrew works by creating a new branch in your local development git repository together with a separate coldbrew remote. When creating a snapshot, coldbrew will switch to the coldbrew branch, stage all changes, commit them, push to the special remote and then switch back to the branch you were working on while unstaging any files that were not staged before the snapshot.
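The snapshot steps described above can be written down as a sequence of plain git invocations. The following is only a sketch of that sequence, not coldbrew's actual implementation (which uses GitPython): the function name `snapshot_commands` is hypothetical, the branch and remote names are assumed to both be `coldbrew`, and the final step of restoring the previous staging state is left out. The commands are returned, not executed.

```python
from typing import List


def snapshot_commands(
    current_branch: str,
    message: str,
    snapshot_branch: str = "coldbrew",  # assumed name of the snapshot branch
    remote: str = "coldbrew",           # assumed name of the special remote
) -> List[List[str]]:
    """Return the git invocations for one snapshot, in order, without running them."""
    return [
        ["git", "checkout", snapshot_branch],      # switch to the snapshot branch
        ["git", "add", "--all"],                   # stage all changes
        ["git", "commit", "-m", message],          # commit them
        ["git", "push", remote, snapshot_branch],  # push to the special remote
        ["git", "checkout", current_branch],       # return to the working branch
    ]


if __name__ == "__main__":
    for cmd in snapshot_commands("main", "Changed dataset loader"):
        print(" ".join(cmd))
```

Keeping the plan as data makes it easy to inspect what a snapshot would do before anything touches the repository.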

In the web interface, you now have the option to prepare an experiment from this (and all previous) snapshot(s). Once you do, the server will check out the corresponding code to a unique location (you can also check out the same snapshot multiple times). It will also create a new subdirectory in the results directory, which is where your logs, model files etc. should go.
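To illustrate the directory side of preparing an experiment, here is a minimal sketch. It only creates the two directories; the git checkout itself is omitted. The function name, the `code_root` location, and the naming scheme (snapshot id plus a random suffix) are assumptions made for this example and may differ from what coldbrew actually does.

```python
import tempfile
import uuid
from pathlib import Path
from typing import Optional, Tuple


def prepare_experiment_dirs(
    code_root: Path,
    results_root: Path,
    snapshot_id: str,
    group: Optional[str] = None,
) -> Tuple[Path, Path]:
    """Create a unique code directory and a results subdirectory for one experiment."""
    # A random suffix lets the same snapshot be prepared more than once.
    name = f"{snapshot_id}-{uuid.uuid4().hex[:8]}"
    code_path = code_root / name
    # Grouped experiments share a common subdirectory of the results directory.
    results_path = (results_root / group / name) if group else (results_root / name)
    code_path.mkdir(parents=True, exist_ok=False)
    results_path.mkdir(parents=True, exist_ok=False)
    return code_path, results_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        code, results = prepare_experiment_dirs(
            root / "code", root / "experiments", "abc1234", group="Dataset Tests"
        )
        print(code)
        print(results)
```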

The experiment's code can be directly viewed from the web interface. Also, you can copy the code and results paths and filled versions of the template strings you defined. These are meant to be directly pasted into a terminal to start the training (or tensorboard, a test script or anything else).

Quickstart

  • Install from PyPI
$ pip install coldbrew
  • Navigate to the results directory on your training server (e.g. /hdd/experiments/) and initialize the server repository there:
$ coldbrew server init
  • On your local development machine, set up your git repository for coldbrew:
$ coldbrew init
  • Start the server which runs the web interface:
$ coldbrew server -a my.host.org -p 12345
  • Start creating snapshots
$ coldbrew -d "Changed dataset loader" -g "Dataset Tests"

Web Interface

With the web interface, you can:

  • Prepare a new training: From the available code snapshots, check out one into a code directory, create a results directory and return strings that can be directly pasted into a terminal to start the training
  • Get an overview of currently active experiments
  • Sort and filter by experiment description and date
  • Delete the code and results directories for experiments you don't need anymore
  • View the code snapshots for all experiments in an online viewer

[Screenshots: Experiments Overview, Experiment Details, Experiment Preparation]

Grouping

From version 1.2 onwards, you can group experiments together by supplying the -g parameter:

$ coldbrew -d "Changed dataset loader" -g "Dataset Tests"

Experiments of the same group will be placed in a common subdirectory and are visually separated in the web interface.
You can also change the group after preparation by moving the experiment result directories into different subdirectories.
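Changing the group after preparation is just a directory move. A small sketch, assuming the layout is `<results directory>/<group>/<experiment>`; the helper name `move_to_group` is hypothetical and not part of coldbrew:

```python
import shutil
import tempfile
from pathlib import Path


def move_to_group(results_root: Path, experiment_dir: Path, new_group: str) -> Path:
    """Move an experiment's results directory into the subdirectory of another group."""
    target_dir = results_root / new_group
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / experiment_dir.name
    if target.exists():
        # Refuse to overwrite an experiment that already lives in the target group.
        raise FileExistsError(f"{target} already exists")
    shutil.move(str(experiment_dir), str(target))
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        exp = root / "Dataset Tests" / "exp-1"
        exp.mkdir(parents=True)
        print(move_to_group(root, exp, "Loader Tests"))
```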

Templating

You can provide templates that will be filled when preparing an experiment.

They can contain anything; the two variables {code_path} and {results_path} will be replaced with the respective path strings. To define templates, put them into a file named .coldbrew in the root directory of your project. If you want to add more templates to all experiments afterwards, you can also define them in a .coldbrew file in the root of the results directory.

A typical .coldbrew file would contain:

python {code_path}/train.py --results {results_path} --gpu 4
tensorboard --logdir={results_path}/logs/
python test.py --gpu 0 --input some_data.h5 --output {results_path}/evaluation/ --model {results_path}/models/epoch-best.pth

Which will be filled like this:

[Screenshot: Command Strings]
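The substitution itself is simple string replacement. The sketch below shows one way it could work, assuming one template per line in the .coldbrew file; `render_templates` and the example paths are made up for illustration and are not coldbrew's own API.

```python
from typing import List


def render_templates(template_text: str, code_path: str, results_path: str) -> List[str]:
    """Fill {code_path} and {results_path} in every non-empty template line."""
    rendered = []
    for line in template_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # str.replace leaves any other braces in the command untouched,
        # unlike str.format, which would raise on unknown placeholders.
        rendered.append(
            line.replace("{code_path}", code_path).replace("{results_path}", results_path)
        )
    return rendered


if __name__ == "__main__":
    templates = (
        "python {code_path}/train.py --results {results_path} --gpu 4\n"
        "tensorboard --logdir={results_path}/logs/\n"
    )
    for command in render_templates(templates, "/hdd/code/exp1", "/hdd/experiments/exp1"):
        print(command)
    # python /hdd/code/exp1/train.py --results /hdd/experiments/exp1 --gpu 4
    # tensorboard --logdir=/hdd/experiments/exp1/logs/
```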

Dependencies

  • Python >= 3.7
  • GitPython (managing the underlying git workflows)
  • flask (serving the webpage)
  • waitress (replacing flask's development server with a more stable alternative)

Misc

  • This tool was made for research purposes and has not been tested for production use
  • The web interface allows you to view and delete files and has not been designed with security in mind. You should not run this on a public-facing server.