FAICE
FAICE (Fair Collaboration and Experiments) is a tool suite that helps researchers work with experiments published in the FAICE description format. The FAICE software is developed at CBMI (HTW Berlin - University of Applied Sciences).
Install
FAICE is cross-platform software implemented in Python 3 and can be installed via Python's package manager pip. Make sure to use the Python 3 version of pip, usually referred to as pip3.
pip3 install --user --upgrade faice
# running faice should list the available subcommands
faice
# alternative command if Python's script directory is not set in the PATH variable
python3 -m faice
Glossary
This glossary explains the basic vocabulary as used by the FAICE tools.
Execution Engine
An execution engine is able to run programs in a specified manner. The executed programs usually take some input files and parameters and produce result files. Support for arbitrary execution engines can be implemented in FAICE. Currently two execution engines, Curious Containers and CWLTool (Common Workflow Language), are available.
Remote Data Storage
A remote data storage provides files that can be downloaded over a network. Common protocols are HTTP and SFTP (SSH). Access to the resources may be restricted and may require authentication.
Experiment
An experiment is a formal description of a program and how to invoke it with certain input files and parameters.
The experiment description can be redistributed to other researchers or published on a website (e.g. GitHub), so that
results are easily reproducible with the faice CLI tools.
FAICE experiments are JSON files, which may contain sensitive information like credentials for remote data storage or an
execution engine. Before an experiment is published, these secrets should be replaced by variables.
For example, password: "SECRET" can be replaced with the variable password: "{{data_password}}", written in double
curly braces.
The syntax for variables is borrowed from the Python templating engine Jinja2 and similar templating engines used
in various programming languages.
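The variable mechanism described above can be sketched in a few lines of Python. This is only an illustration using the standard library, not the way the faice tools implement it; the function names find_variables and fill_variables and the field names in the sample are hypothetical.

```python
import json
import re

# Matches {{ name }} with optional whitespace around the variable name.
VARIABLE_PATTERN = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def find_variables(text):
    """Return the distinct variable names in order of first appearance."""
    names = []
    for name in VARIABLE_PATTERN.findall(text):
        if name not in names:
            names.append(name)
    return names


def fill_variables(text, values):
    """Replace every {{name}} with values[name]; raises KeyError if a value is missing."""
    def substitute(match):
        # json.dumps escapes quotes and backslashes; [1:-1] drops the outer quotes
        # because the variable already sits inside a JSON string.
        return json.dumps(values[match.group(1)])[1:-1]
    return VARIABLE_PATTERN.sub(substitute, text)


# Hypothetical experiment fragment; the field names are illustrative only.
template = '{"username": "{{data_username}}", "password": "{{data_password}}"}'

print(find_variables(template))
filled = fill_variables(template, {"data_username": "alice", "data_password": "SECRET"})
print(json.loads(filled))
```

Listing the variables first makes it possible to prompt the user for exactly the values that a published experiment leaves open.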
Usage
FAICE provides various tools via a common command line interface. The help command faice ${tool} -h shows additional
information about each tool.
# list available tools
faice
# run the specified experiment in an execution engine
faice run -h
# generate a Vagrantfile to launch the specified execution engine in a virtual machine
faice vagrant -h
Run
The faice run tool runs the given experiment by sending the instructions contained in the experiment JSON file to
the specified execution engine.
Vagrant
The faice vagrant tool can be used to set up a local execution engine. When using a local execution
engine, it is not necessary to know any secret credentials for online resources. This local execution engine runs in a
Vagrant virtual machine (VM), for which all necessary configuration files are generated by faice vagrant.
Requirements: The latest versions of VirtualBox and Vagrant should be installed on the system. Starting a virtual
machine reserves a fixed amount of system resources, such as RAM and CPUs. The required resources are
declared in the experiment's JSON description as host_ram and host_cpus.
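Before launching the VM, it can be useful to look up the declared resources. The sketch below searches a loaded experiment for host_ram and host_cpus. Where these keys sit inside the JSON document and which units they use are not stated here, so the helper find_key (a hypothetical name) simply searches the whole structure, and the sample placement under engine_config is an assumption.

```python
def find_key(node, key):
    """Depth-first search for the first value stored under `key` in nested JSON data."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


# Hypothetical experiment; the location and values of host_ram/host_cpus are assumed.
experiment = {
    "format_version": "1",
    "execution_engine": {
        "engine_type": "curious-containers",
        "engine_config": {"host_ram": 4096, "host_cpus": 2},
    },
    "instructions": {},
    "meta_data": {},
}

print("host_ram:", find_key(experiment, "host_ram"))
print("host_cpus:", find_key(experiment, "host_cpus"))
```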
FAICE Description Format
The basic structure of the format is defined in the jsonschema language as follows:
{
    "type": "object",
    "properties": {
        "format_version": {"enum": ["1"]},
        "execution_engine": {
            "type": "object",
            "properties": {
                "engine_type": {"enum": ["curious-containers", "common-workflow-language"]},
                "engine_config": {"type": "object"}
            },
            "required": ["engine_type", "engine_config"],
            "additionalProperties": false
        },
        "instructions": {},
        "meta_data": {}
    },
    "required": ["format_version", "execution_engine", "instructions", "meta_data"],
    "additionalProperties": false
}
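The constraints of this schema are few enough to check by hand. The following sketch mirrors them with the Python standard library only; it is not how the faice tools validate experiments, and the function name check_experiment is hypothetical. A jsonschema validator could be used instead.

```python
TOP_LEVEL_FIELDS = {"format_version", "execution_engine", "instructions", "meta_data"}
ENGINE_FIELDS = {"engine_type", "engine_config"}
ENGINE_TYPES = {"curious-containers", "common-workflow-language"}


def check_keys(obj, expected, prefix, problems):
    # "required" plus "additionalProperties": false means exactly these keys.
    for key in sorted(expected - set(obj)):
        problems.append("missing field: " + prefix + key)
    for key in sorted(set(obj) - expected):
        problems.append("unexpected field: " + prefix + key)


def check_experiment(experiment):
    """Return a list of problems; an empty list means the structure matches the schema."""
    if not isinstance(experiment, dict):
        return ["experiment must be a JSON object"]
    problems = []
    check_keys(experiment, TOP_LEVEL_FIELDS, "", problems)
    if "format_version" in experiment and experiment["format_version"] != "1":
        problems.append('format_version must be the string "1"')
    if "execution_engine" in experiment:
        engine = experiment["execution_engine"]
        if not isinstance(engine, dict):
            problems.append("execution_engine must be an object")
        else:
            check_keys(engine, ENGINE_FIELDS, "execution_engine.", problems)
            if "engine_type" in engine and engine["engine_type"] not in ENGINE_TYPES:
                problems.append("unknown engine_type: " + repr(engine["engine_type"]))
            if "engine_config" in engine and not isinstance(engine["engine_config"], dict):
                problems.append("engine_config must be an object")
    return problems


# Minimal document that satisfies the schema; the empty objects are placeholders.
valid = {
    "format_version": "1",
    "execution_engine": {"engine_type": "curious-containers", "engine_config": {}},
    "instructions": {},
    "meta_data": {},
}
print(check_experiment(valid))
```

Returning a list of problems instead of raising on the first one lets a user fix a hand-written experiment file in a single pass.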
For example experiments, take a look at the BDCAT2017 TDS Experiment GitHub repository. The main fields contained in the JSON document are described below:
format_version
The format_version field exists so that future versions of the format can remain backwards compatible. For now, the
value of this field is always "1".
execution_engine
The execution_engine field contains an object with engine_type and engine_config fields, where engine_type must
be set to one of the available execution engines. The engine_config field gives precise information about the
execution engine itself (like install requirements). The exact content of engine_config is defined by the engine
integration in the FAICE software.
instructions
The instructions field contains a technical description of how to run the experiment with the specified execution
engine. This means that instructions contains whatever is passed to the execution engine (e.g. a task in JSON format
for Curious Containers).
meta_data
While the instructions field contains a technical description of how to run an experiment, it may lack further
information that is useful for humans. The meta_data field complements whatever data is already present in
instructions and varies from execution engine to execution engine. Some of the metadata is displayed to the user by
the faice CLI tools.