State-of-the art Automated Machine Learning python library for Tabular Data


Keywords
auto-ml, automatic-machine-learning, automl, cross-validation, data-science, data-science-projects, hyperparameter-optimization, hyperparameter-tuning, machine-learning, machine-learning-library, machine-learning-models, ml, model-selection, optimisation, python, sklearn, stacking, stacking-ensemble, xgboost
License
MIT
Install
pip install automl-alex==2023.3.11

Documentation

AutoML Alex

Downloads PyPI - Python Version PyPI CodeFactor Telegram License


State-of-the art Automated Machine Learning python library for Tabular Data

Works with Tasks:

  • Binary Classification

  • Regression

  • Multiclass Classification (in progress...)

Benchmark Results

bench

The bigger, the better
From AutoML-Benchmark

Scheme

scheme

Features

  • Automated Data Clean (Auto Clean)
  • Automated Feature Engineering (Auto FE)
  • Smart Hyperparameter Optimization (HPO)
  • Feature Generation
  • Feature Selection
  • Models Selection
  • Cross Validation
  • Optimization Timelimit and EarlyStoping
  • Save and Load (Predict new data)

Installation

pip install automl-alex

Docs

DocPage

🚀 Examples

Classifier:

from automl_alex import AutoMLClassifier

model = AutoMLClassifier()
model.fit(X_train, y_train, timeout=600)
predicts = model.predict(X_test)

Regression:

from automl_alex import AutoMLRegressor

model = AutoMLRegressor()
model.fit(X_train, y_train, timeout=600)
predicts = model.predict(X_test)

DataPrepare:

from automl_alex import DataPrepare

de = DataPrepare()
X_train = de.fit_transform(X_train)
X_test = de.transform(X_test)

Simple Models Wrapper:

from automl_alex import LightGBMClassifier

model = LightGBMClassifier()
model.fit(X_train, y_train)
predicts = model.predict_proba(X_test)

model.opt(X_train, y_train,
    timeout=600, # optimization time in seconds,
    )
predicts = model.predict_proba(X_test)

More examples in the folder ./examples:

What's inside

It integrates many popular frameworks:

  • scikit-learn
  • XGBoost
  • LightGBM
  • CatBoost
  • Optuna
  • ...

Works with Features

  • Categorical Features

  • Numerical Features

  • Binary Features

  • Text

  • Datetime

  • Timeseries

  • Image

Note

  • With a large dataset, a lot of memory is required! Library creates many new features. If you have a large dataset with a large number of features (more than 100), you may need a lot of memory.

Realtime Dashboard

Works with optuna-dashboard

Dashboard

Run

$ optuna-dashboard sqlite:///db.sqlite3

Road Map

  • Feature Generation

  • Save/Load and Predict on New Samples

  • Advanced Logging

  • Add opt Pruners

  • Docs Site

  • DL Encoders

  • Add More libs (NNs)

  • Multiclass Classification

  • Build pipelines

Contact

Telegram Group