Report | Homepage | BgoFace UI
Please star ⭐ this project to support open-source development! For questions or collaboration, contact: Dr. Bin Cao (bcao686@connect.hkust-gz.edu.cn)
Bgolearn is a lightweight and extensible Python package for Bayesian global optimization, built for accelerating materials discovery and design. It provides out-of-the-box support for regression and classification tasks, implements various acquisition strategies, and offers a seamless pipeline for virtual screening, active learning, and multi-objective optimization.
📦 Official PyPI:
pip install Bgolearn
🎥 Code tutorial (BiliBili): Watch here · Colab Demo: Run it online
pip install Bgolearn              # install
pip install --upgrade Bgolearn    # upgrade to the latest release
pip show Bgolearn                 # check the installed version
import Bgolearn.BGOsampling as BGOS
import pandas as pd
# Load characterized dataset
data = pd.read_csv('data.csv')
x = data.iloc[:, :-1] # features
y = data.iloc[:, -1] # response
# Load virtual samples
vs = pd.read_csv('virtual_data.csv')
# Instantiate and run model
Bgolearn = BGOS.Bgolearn()
Mymodel = Bgolearn.fit(data_matrix=x, Measured_response=y, virtual_samples=vs)
# Get result using Expected Improvement
Mymodel.EI()
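According to the `fit` docstring reproduced at the end of this page, each acquisition call returns the utility (potential) of every virtual sample together with the recommended candidate(s). A minimal sketch of keeping those outputs; the two-value unpacking is an assumption based on that documented return:

potential, candidates = Mymodel.EI()  # utility per virtual sample, recommended candidate(s)
print('Suggested next candidate(s):', candidates)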
Install the extension toolkit:
pip install BgoKit
from BgoKit import ToolKit
# score_1 / score_2: per-sample scores of the two objectives on vs (see the sketch below)
Model = ToolKit.MultiOpt(vs, [score_1, score_2])
Model.BiSearch()
Model.plot_distribution()
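Here `score_1` and `score_2` are the per-sample scores of the two objectives evaluated on the virtual samples `vs`. One hedged way to obtain them, assuming two single-objective Bgolearn models (one per target property, with hypothetical responses `y1` and `y2`) and the two-value return described in the `fit` docstring:

# Hypothetical sketch: one single-objective model per target property
model_1 = Bgolearn.fit(data_matrix=x, Measured_response=y1, virtual_samples=vs)
model_2 = Bgolearn.fit(data_matrix=x, Measured_response=y2, virtual_samples=vs)
score_1, _ = model_1.EI()  # per-sample utility for objective 1
score_2, _ = model_2.EI()  # per-sample utility for objective 2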
See detailed demo: Multi-objective Example

- Expected Improvement (EI)
- Augmented Expected Improvement (AEI)
- Expected Quantile Improvement (EQI)
- Upper Confidence Bound (UCB)
- Probability of Improvement (PI)
- Predictive Entropy Search (PES)
- Knowledge Gradient (KG)
- Reinterpolation EI (REI)
- Expected Improvement with Plugin
- Least Confidence
- Margin Sampling
- Entropy-based approach
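The strategies above (Expected Improvement through Expected Improvement with Plugin for regression; Least Confidence, Margin Sampling, and the entropy-based approach for classification) are exposed as methods on the fitted model. Only `EI()` appears in the quick start; the other method names below are assumptions patterned on the abbreviations in this list:

# EI() is confirmed by the quick start above; UCB() and PI() are assumed
# names that mirror the listed abbreviations.
Mymodel.EI()   # Expected Improvement
Mymodel.UCB()  # Upper Confidence Bound (assumed name)
Mymodel.PI()   # Probability of Improvement (assumed name)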
BgoFace is the graphical frontend of Bgolearn, providing no-code access to its backend algorithms.
Supports a broad range of acquisition strategies (EI, UCB, KG, PES, etc.) for both single- and multi-objective optimization, and works well with the sparse, high-dimensional datasets common in materials science.
Use BgoKit and MultiBgolearn to implement Pareto optimization across multiple target properties (e.g., strength & ductility), enabling parallel evaluation across virtual samples.
Incorporates adaptive sampling in an active learning loop (experiment → prediction → update) to accelerate optimization with fewer experiments, as sketched below.
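A minimal sketch of that loop, assuming a user-supplied `run_experiment()` measurement step (a placeholder, not part of Bgolearn) and the return convention from the `fit` docstring:

import pandas as pd

# Hypothetical active-learning loop: recommend -> measure -> augment -> refit
for iteration in range(5):
    Mymodel = Bgolearn.fit(data_matrix=x, Measured_response=y, virtual_samples=vs)
    _, candidate = Mymodel.EI()        # recommended next experiment
    new_y = run_experiment(candidate)  # user-side measurement (placeholder)
    x = pd.concat([x, pd.DataFrame([candidate], columns=x.columns)], ignore_index=True)
    y = pd.concat([y, pd.Series([new_y])], ignore_index=True)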
- Nano Letters: Self-Driving Laboratory under UHV (Link)
- Small: ML-Engineered Nanozyme System for Anti-Tumor Therapy (Link)
- Computational Materials Science: Mg-Ca-Zn Alloy Optimization (Link)
- Measurement: Foaming Agent Optimization in EPB Shield Construction (Link)
- Intelligent Computing: Metasurface Design via Bayesian Learning (Link)
- Materials & Design: Lead-Free Solder Alloys via Active Learning (Link)
- npj Computational Materials: MLMD Platform with Bgolearn Backend (Link)
Released under the MIT License. 💼 Free for academic and commercial use. Please cite relevant publications if used in research.
We welcome community contributions and research collaborations:
- Submit issues for bug reports, ideas, or suggestions
- Submit pull requests for code contributions
- Contact Bin Cao (bcao686@connect.hkust-gz.edu.cn) for collaborations
Signature:
Bgolearn.fit(
    data_matrix,
    Measured_response,
    virtual_samples,
    Mission='Regression',
    Classifier='GaussianProcess',
    noise_std=None,
    Kriging_model=None,
    opt_num=1,
    min_search=True,
    CV_test=False,
    Dynamic_W=False,
    seed=42,
)
================================================================
:param data_matrix: data matrix of the training dataset, X.
:param Measured_response: response of the training dataset, y.
:param virtual_samples: designed virtual samples.
:param Mission: str, default 'Regression'; the optimization task. Mission = 'Regression' or 'Classification'
:param Classifier: the classifier applied when Mission == 'Classification'.
If the user does not supply one, Bgolearn falls back to a pre-set classifier.
Default: Classifier = 'GaussianProcess', i.e., a Gaussian Process Classifier.
Five classifiers are pre-set in Bgolearn:
'GaussianProcess' --> Gaussian Process Classifier (default)
'LogisticRegression' --> Logistic Regression
'NaiveBayes' --> Naive Bayes Classifier
'SVM' --> Support Vector Machine Classifier
'RandomForest' --> Random Forest Classifier
:param noise_std: float or ndarray of shape (n_samples,), default=None
Value added to the diagonal of the kernel matrix during fitting.
This can prevent a potential numerical issue during fitting, by
ensuring that the calculated values form a positive definite matrix.
It can also be interpreted as the variance of additional Gaussian
measurement noise on the training observations.
If noise_std is None, a noise value will be estimated by maximum
likelihood on the training dataset.
:param Kriging_model (default None):
str, Kriging_model = 'SVM', 'RF', 'AdaB', or 'MLP'
The corresponding machine learning model will be used: Support Vector Machine (SVM),
Random Forest (RF), AdaBoost (AdaB), or Multi-Layer Perceptron (MLP).
The estimation uncertainty is determined by Bootstrap sampling.
or
a user-defined callable Kriging model with a <fit_pre> attribute.
If the user does not supply one, Bgolearn falls back to a pre-set Kriging model.
attribute <fit_pre>:
input -> xtrain, ytrain, xtest ;
output -> predicted mean and std of xtest
e.g. (take GaussianProcessRegressor in sklearn):
class Kriging_model(object):
    def fit_pre(self, xtrain, ytrain, xtest):
        # instantiate the model
        kernel = RBF()
        model = GaussianProcessRegressor(kernel=kernel).fit(xtrain, ytrain)
        # define the attribute's outputs
        mean, std = model.predict(xtest, return_std=True)
        return mean, std
e.g. (multi-model estimation):
class Kriging_model(object):
    def fit_pre(self, xtrain, ytrain, xtest):
        # instantiate the models
        pre_1 = SVR(C=10).fit(xtrain, ytrain).predict(xtest)  # model_1
        pre_2 = SVR(C=50).fit(xtrain, ytrain).predict(xtest)  # model_2
        pre_3 = SVR(C=80).fit(xtrain, ytrain).predict(xtest)  # model_3
        # model_1, model_2, model_3 can be changed to any ML models you desire
        # define the attribute's outputs
        stacked_array = np.vstack((pre_1, pre_2, pre_3))
        means = np.mean(stacked_array, axis=0)
        std = np.sqrt(np.var(stacked_array, axis=0))
        return means, std
:param opt_num: the number of recommended candidates for the next iteration, default 1.
:param min_search: default True -> searching the global minimum ;
False -> searching the global maximum.
:param CV_test: 'LOOCV' or an int, default False (skip the test).
If CV_test = 'LOOCV', leave-one-out cross-validation will be applied;
elif CV_test is an int, e.g., CV_test = 10, 10-fold cross-validation will be applied.
:return: 1: array; potential of each candidate. 2: array/float; recommended candidate(s).
File: ~/miniconda3/lib/python3.9/site-packages/Bgolearn/BGOsampling.py
Type: method
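For instance, combining several of the documented options above (the argument values here are illustrative only):

# Illustrative call: maximize the target, recommend three candidates,
# and run a leave-one-out cross-validation check, per the options above.
Mymodel = Bgolearn.fit(
    data_matrix=x,
    Measured_response=y,
    virtual_samples=vs,
    Mission='Regression',
    min_search=False,  # search the global maximum
    opt_num=3,         # recommend three candidates
    CV_test='LOOCV',   # leave-one-out cross-validation
)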