HydraFlow seamlessly integrates Hydra and MLflow to streamline machine learning experiment workflows. By combining Hydra's powerful configuration management with MLflow's robust experiment tracking, HydraFlow provides a comprehensive solution for defining, executing, and analyzing machine learning experiments.
HydraFlow is built on the following design principles:
- Type Safety - Utilizing Python dataclasses for configuration type checking and IDE support
- Reproducibility - Automatically tracking all experiment configurations for fully reproducible experiments
- Analysis Capabilities - Providing powerful APIs for easily analyzing experiment results
- Workflow Integration - Creating a cohesive workflow by integrating Hydra's configuration management with MLflow's experiment tracking
HydraFlow offers the following key features:
- Type-safe Configuration Management - Define experiment parameters using Python dataclasses with full IDE support and validation
- Seamless Hydra-MLflow Integration - Automatically register configurations with Hydra and track experiments with MLflow
- Advanced Parameter Sweeps - Define complex parameter spaces using extended sweep syntax for numerical ranges, combinations, and SI prefixes
- Workflow Automation - Create reusable experiment workflows with YAML-based job definitions
- Powerful Analysis Tools - Filter, group, and analyze experiment results with type-aware APIs
- Custom Implementation Support - Extend experiment analysis with domain-specific functionality
Install HydraFlow with pip:
pip install hydraflow
Requirements: Python 3.13+
A minimal example application (app.py):
from dataclasses import dataclass

from mlflow.entities import Run

import hydraflow


@dataclass
class Config:
    width: int = 1024
    height: int = 768


@hydraflow.main(Config)
def app(run: Run, cfg: Config) -> None:
    # Your experiment code here
    print(f"Running with width={cfg.width}, height={cfg.height}")


if __name__ == "__main__":
    app()
Execute a parameter sweep with:
python app.py -m width=800,1200 height=600,900
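With Hydra's multirun flag `-m`, comma-separated values are expanded into a grid, so the command above launches one run per combination of `width` and `height`. The following is a minimal sketch of that expansion in plain Python; the names `sweep` and `overrides` are illustrative only and are not part of HydraFlow or Hydra.

```python
from itertools import product

# Values taken from the sweep command above.
sweep = {"width": [800, 1200], "height": [600, 900]}

# Cartesian product of all parameter values: one dictionary per run.
overrides = [dict(zip(sweep, values)) for values in product(*sweep.values())]

for override in overrides:
    print(override)
```

Two values for each of two parameters give four runs in total.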
HydraFlow consists of the following key components:
Define type-safe configurations using Python dataclasses:
@dataclass
class Config:
    learning_rate: float = 0.001
    batch_size: int = 32
    epochs: int = 10
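To illustrate why typed fields matter, here is a rough sketch of how the field types of a dataclass can drive the conversion and validation of command-line style overrides. It uses only the standard library; `apply_overrides` is a hypothetical helper written for this example, not a HydraFlow or Hydra function. Hydra itself relies on OmegaConf structured configs for this kind of checking.

```python
from dataclasses import dataclass, fields


@dataclass
class Config:
    learning_rate: float = 0.001
    batch_size: int = 32
    epochs: int = 10


def apply_overrides(cfg: Config, overrides: list[str]) -> Config:
    """Hypothetical helper: apply "key=value" strings using the field types."""
    types = {f.name: f.type for f in fields(cfg)}
    for item in overrides:
        key, raw = item.split("=", 1)
        if key not in types:
            raise KeyError(f"unknown parameter: {key}")
        # int("ten") or float("abc") raises ValueError, so bad input fails early.
        setattr(cfg, key, types[key](raw))
    return cfg


cfg = apply_overrides(Config(), ["learning_rate=0.01", "batch_size=64"])
print(cfg)
```

An override such as `epochs=ten` would raise a `ValueError` here instead of silently storing a string in an integer field.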
The @hydraflow.main decorator integrates Hydra and MLflow:
@hydraflow.main(Config)
def train(run: Run, cfg: Config) -> None:
    # Your experiment code
    ...
Define reusable experiment workflows in YAML:
jobs:
  train_models:
    run: python train.py
    sets:
      - each: model=small,medium,large
        all: learning_rate=0.001,0.01,0.1
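The job definition above can be read as a recipe for generating commands. The sketch below assumes that `each` produces one command per value and that `all` is passed unchanged to every command as a Hydra multirun sweep. This is an interpretation for illustration; the commands HydraFlow actually generates may contain additional arguments.

```python
from itertools import product

# Values copied from the job definition above.
run = "python train.py"
each = {"model": ["small", "medium", "large"]}
all_ = "learning_rate=0.001,0.01,0.1"

# One command per combination of the `each` values; the `all` string is
# appended as-is so that Hydra sweeps it inside every command (assumption).
commands = [
    " ".join([run, "-m", *(f"{k}={v}" for k, v in zip(each, values)), all_])
    for values in product(*each.values())
]

for command in commands:
    print(command)
```

Under these assumptions the job expands to three commands, each sweeping over the three learning rates.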
Analyze experiment results with powerful APIs:
from hydraflow import Run, iter_run_dirs
# Load runs
runs = Run.load(iter_run_dirs("mlruns"))
# Filter and analyze
best_runs = runs.filter(model_type="transformer").to_frame("learning_rate", "accuracy")
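Conceptually, the filter-then-tabulate pattern above selects runs whose configuration matches the given values and then keeps only the requested columns. The following plain-Python analogue operates on a list of dictionaries. The functions `filter_records` and `to_rows` and all values in `records` are made up for illustration; they are not the HydraFlow API and not real results.

```python
# Made-up run records for illustration only (not real experiment results).
records = [
    {"model_type": "transformer", "learning_rate": 0.001, "accuracy": 0.9},
    {"model_type": "cnn", "learning_rate": 0.01, "accuracy": 0.8},
    {"model_type": "transformer", "learning_rate": 0.01, "accuracy": 0.7},
]


def filter_records(records: list[dict], **conditions) -> list[dict]:
    """Keep the records whose values equal every given key=value condition."""
    return [r for r in records if all(r.get(k) == v for k, v in conditions.items())]


def to_rows(records: list[dict], *columns: str) -> list[dict]:
    """Project each record onto the requested columns."""
    return [{c: r[c] for c in columns} for r in records]


rows = to_rows(filter_records(records, model_type="transformer"), "learning_rate", "accuracy")
for row in rows:
    print(row)
```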
For detailed documentation, visit our documentation site:
- Getting Started - Installation and core concepts
- Practical Tutorials - Learn through hands-on examples
- User Guide - Detailed documentation of HydraFlow's capabilities
- API Reference - Complete API documentation
This project is licensed under the MIT License.