The RADICAL pilot job framework


Keywords
radical, pilot, job, hacktoberfest
License
MIT
Install
pip install radical.pilot==1.52.0

RADICAL-Pilot (RP)

RADICAL-Pilot (RP) executes heterogeneous tasks with maximum concurrency and at scale. RP can concurrently execute up to $10^5$ heterogeneous tasks, including single-core, multi-core, GPU, MPI and OpenMP tasks. Tasks can be stand-alone executables or Python functions, and both types can be executed concurrently.

RP is a Pilot system, i.e., it separates resource acquisition from using those resources to execute application tasks. RP acquires resources by submitting a job to a high performance computing (HPC) platform, and it then schedules and launches computational tasks directly on those resources, not via the batch system of the platform. RP can concurrently use one or more pilots on one or more HPC platforms.
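
The separation between acquiring resources and scheduling tasks can be illustrated with a toy model in plain Python. This is only a sketch of the idea and not how RP is implemented: ToyPilot, toy_task and the thread pool standing in for acquired cores are all hypothetical names and simplifications introduced here.

```python
from concurrent.futures import ThreadPoolExecutor
import time


class ToyPilot:
    """Hypothetical stand-in for a pilot: a fixed set of slots acquired once."""

    def __init__(self, cores):
        self.cores = cores
        # Resource acquisition happens once, up front. In a real pilot
        # system this step would be a single job submitted to the platform.
        self._pool = ThreadPoolExecutor(max_workers=cores)

    def submit(self, func, *args):
        # Tasks go straight onto the slots the pilot already holds;
        # no further request to a batch system is made per task.
        return self._pool.submit(func, *args)

    def release(self):
        # Give the slots back once all tasks have finished.
        self._pool.shutdown(wait=True)


def toy_task(task_id, seconds):
    # Placeholder workload: sleep briefly and report which task ran.
    time.sleep(seconds)
    return task_id


pilot = ToyPilot(cores=4)
futures = [pilot.submit(toy_task, i, 0.01) for i in range(8)]
results = [f.result() for f in futures]
pilot.release()
```

With 4 slots and 8 tasks, at most 4 tasks run at any time and the rest start as slots free up, all without going back to the platform for more resources.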

RP is written in Python and exposes a simple yet powerful API. In about 20 lines of code, you can execute an arbitrary number of executables with maximum concurrency on a Linux container or, by changing the resource label, on one of the supported HPC platforms.

import radical.pilot as rp

# Create a session
session = rp.Session()

# Create a pilot manager and a pilot
pmgr    = rp.PilotManager(session=session)
pd_init = {'resource': 'local.localhost',
           'runtime' : 30,
           'cores'   : 4}
pdesc   = rp.PilotDescription(pd_init)
pilot   = pmgr.submit_pilots(pdesc)

# Create a task manager and describe your tasks
tmgr = rp.TaskManager(session=session)
tmgr.add_pilots(pilot)
tds = list()
for i in range(8):
    td = rp.TaskDescription()
    td.executable     = 'sleep'
    td.arguments      = ['10']
    tds.append(td)

# Submit your tasks for execution
tmgr.submit_tasks(tds)
tmgr.wait_tasks()

# Close your session
session.close(cleanup=True)
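
The point about switching platforms by changing the resource can be sketched with plain dictionaries shaped like pd_init in the example above. The helper pilot_description and the '<platform_label>' placeholder are hypothetical and introduced only for illustration. A real platform will likely need further settings that this sketch leaves out.

```python
def pilot_description(resource, runtime=30, cores=4):
    # Hypothetical helper: builds a dict with the same keys as pd_init above.
    return {'resource': resource,
            'runtime' : runtime,
            'cores'   : cores}


# Local run, same values as in the example above.
local_pd = pilot_description('local.localhost')

# Placeholder label: substitute the label of a supported HPC platform.
hpc_pd = pilot_description('<platform_label>')

# Only the 'resource' entry differs between the two descriptions.
changed = {key for key in local_pd if local_pd[key] != hpc_pd[key]}
```

Either dictionary could then be passed to rp.PilotDescription in the same way pd_init is passed in the example.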

Quick Start

Run RP's quick start tutorial directly on Binder. No installation needed.

After going through the tutorial, install RP and start to code your application:

python -m venv ~/.ve/radical-pilot
. ~/.ve/radical-pilot/bin/activate
pip install radical.pilot

Instead of venv, you can also use virtualenv, conda or spack.

For some inspiration, see our RP application examples, starting with 00_getting_started.py.

Documentation

RP user documentation uses Sphinx, and it is published on Read the Docs.

RP tutorials can be run via Binder.

Developers

RP development uses Git and GitHub. RP requires Python 3, a virtual environment and a GNU/Linux OS. Clone RP, install its documentation requirements and build the documentation:

python -m venv ~/.ve/rp-docs
. ~/.ve/rp-docs/bin/activate
git clone git@github.com:radical-cybertools/radical.pilot.git
cd radical.pilot
pip install -r requirements-docs.txt
sphinx-build -M html docs/source/ docs/build/

RP documentation uses tutorials coded as Jupyter notebooks. Sphinx and nbsphinx run RP locally to execute those tutorials. Successful compilation of the documentation also serves as a validation of your local development environment.

Provide Feedback

Have a question or a feature request, or found a bug? Feel free to open a support ticket. For vulnerabilities, please draft a private security advisory.

Contributing

We welcome everyone who wants to contribute to RP development. We are an open and welcoming community, committed to making participation a harassment-free experience for everyone. See our Code of Conduct and the relevant technical documentation, and feel free to get in touch.