Coral Reef Optimization (CRO) Algorithm


Keywords
optimization, algorithm, evolutionary, meta-heuristic, feature, selection, coral, reef
License
MIT
Install
pip install cro==0.0.5.1

Documentation

pypi Travis Supported Python versions

coral-reef-optimization-algorithm

Coral Reefs Optimization (CRO) algorithm artificially simulates a coral reef, where different corals (which are the solutions for the considered optimization problem) grow and reproduce in a coral-reef, fighting with other corals for space and find depredation.

CRO is based on a reef, consisting of a N x M square grid. It assumes that each square is able to allocate a coral (or colony of corals). As other evolutionary algorithms, is based on the fact that reef will progress, as long as healthier corals (better solutions at the problem) survive, while less stronger corals die. Also, as other genetic and evolutionary optimization algorithms, a phase of reproduction takes place. In this case, CRO applies several operators to imitate coral's reproduction:

  • Modelling of sexual reproduction (Broadcast Spawning and Brooding)
    • Broadcast Spawning consists of external reproduction, selecting couples of the pool of broadcast spawner corals (as i.e cross-over operator in a genetic algorithm)
    • Brooding is an internal reproduction (as mutation operator in a genetic algorithm)
  • Modelling of asexual reproduction (Budding)
    • Budding operator duplicates existing corals with a good level of healthiness (best solutions)
  • Also some catastrophic events as coral depredation

Flow diagram of the proposed CRO algorithm:

Publication link: The Coral Reefs Optimization Algorithm: A Novel Metaheuristic for Efficiently Solving Optimization Problems

CRO includes the following features:

  • Optimization algorithm using your own fitness functions, automatically defined functions
  • Focus on Feature selection problem, allowing any kind of machine learning model and metric (ie. scikit-learn)
  • Binary and Discrete modes as corals representation, grid values interval is allowed
  • Few external dependences, CRO uses Numpy as standard library in order to operate with arrays
  • Hall of Fame of the best corals in the reef

In future releases:

  • Parallelization of the fitness function evaluations
  • Continuous mode as corals representation
  • Add a benchmark module containing most common test functions
  • Multi-objective optimization, colony of corals in each grid position
  • Focus on ML hyperparameters optimization and feature selection as a multi-objective approach

Install

To install the library use pip:

pip install cro

or clone the repo and just type the following on your shell:

python setup.py install

Usage examples

Example of usage for max_ones problem. In this problem, max_ones problem, the health function is just the number of ones in the coral in percentage

The following results can be reproduced with command:

import numpy as np
import seaborn as sns 
from cro import *
from cro.fitness import max_ones
from cro.report import plot_results

## ------------------------------------------------------
## Parameters initialization
## ------------------------------------------------------
Ngen = 30                  # Number of generations
N  = 20                    # MxN: reef size
M  = 20                    # MxN: reef size
Fb = 0.7                   # Broadcast prob.
Fa = 0.1                   # Asexual reproduction prob.
Fd = 0.1                   # Fraction of the corals to be eliminated in the depredation operator.
r0 = 0.6                   # Free/total initial proportion
k  = 3                     # Number of opportunities for a new coral to settle in the reef
Pd = 0.1                   # Depredation prob.
opt= 'max'                 # flag: 'max' for maximizing and 'min' for minimizing
L = 100
ke = 0.2
## ------------------------------------------------------

cro = CRO(Ngen, N, M, Fb, Fa, Fd, r0, k, Pd, max_ones, opt, L, verbose=False, ke=ke)
%time (REEF, REEFpob, REEFfitness, ind_best, Bestfitness, Meanfitness) = cro.fit()
plot_results(Bestfitness, Meanfitness, cro)

Output:

Best coral:  [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
Best solution: 100.0
Wall time: 1.06 s

Results for feature selection problem

This example uses a database which was created to identify a voice as male or female (binary target), based upon acoustic properties of the voice and speech. Originally, it contains 20 features and I added 10 noisy ones at the end. More examples in examples

from functools import partial
import numpy as np
import seaborn as sns 
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import roc_auc_score

from cro import *
from cro.utils import load_data
from cro.fitness import feature_selection


## ------------------------------------------------------
## Parameters initialization
## ------------------------------------------------------
Ngen = 20                  # Number of generations
N  = 10                    # MxN: reef size
M  = 10                    # MxN: reef size
Fb = 0.8                   # Broadcast prob.
Fa = 0.2                   # Asexual reproduction prob.
Fd = 0.1                   # Fraction of the corals to be eliminated in the depredation operator.
r0 = 0.6                   # Free/total initial proportion
k  = 3                     # Number of opportunities for a new coral to settle in the reef
Pd = 0.1                   # Depredation prob.
opt= 'max'                 # flag: 'max' for maximizing and 'min' for minimizing
## ------------------------------------------------------

dataset = load_data('voice')
L = dataset.data.shape[1] # number of features
X = dataset.data
y = dataset.target

clf = KNeighborsClassifier(2)

fitness_coral = partial(feature_selection, X=X, y=y, model=clf,
                        get_prediction = lambda clf, X: clf.predict_proba(X)[:, 1], 
                        metric=roc_auc_score)

cro = CRO(Ngen, N, M, Fb, Fa, Fd, r0, k, Pd, fitness_coral, opt, L, seed=13, verbose=True)
%time (REEF, REEFpob, REEFfitness, ind_best, Bestfitness, Meanfitness) = cro.fit(X, y, clf)

names = np.array(dataset.feature_names)
print(names[REEFpob[ind_best, :]>0])

Output:

Best coral:  [0 0 0 1 0 1 1 0 0 0 0 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0]
Best solution: 0.98179427934
Wall time: 35.9 s
['Q25' 'IQR' 'skew' 'centroid' 'meanfun' 'maxfun' 'mindom']

Folder structure

The following shows basic folder structure.

โ”œโ”€โ”€ cro 
โ”‚   โ”œโ”€โ”€ __init__.py
โ”‚   โ”œโ”€โ”€ cro.py 
โ”‚   โ”œโ”€โ”€ fitness.py
โ”‚   โ”œโ”€โ”€ larvaemutation.py
โ”‚   โ”œโ”€โ”€ utils.py
โ”‚   โ”œโ”€โ”€ tests.py
โ”‚   โ”œโ”€โ”€ assests 
โ”‚   |   โ”œโ”€โ”€ data
โ”‚   |   |	โ”œโ”€โ”€ voice.csv
โ”œโ”€โ”€ examples 
โ”‚   โ”œโ”€โ”€ __init__.py
โ”‚   โ”œโ”€โ”€ context.py
โ”‚   โ”œโ”€โ”€ example_basic.py
โ”‚   โ”œโ”€โ”€ example_advanced.py
โ”œโ”€โ”€ tests
โ”‚   โ”œโ”€โ”€ __init__.py
โ”‚   โ”œโ”€โ”€ test_cro.py
โ”‚   โ”œโ”€โ”€ test_fitness.py
โ”‚   โ”œโ”€โ”€ test_larvaemutation.py
โ”‚   โ”œโ”€โ”€ test_reefinitialization.py

Acknowledgements

This implementation has been based on Sancho Salcedo's idea and this project

Contact point: cro_developers@googlegroups.com

If you want to develop CRO library with us, we will be pleased to work with you on this project. Ask to be added to the email group.