cluster-over-sampling

Introduction

A general interface for clustering-based over-sampling algorithms.

Installation

For user installation, cluster-over-sampling is available on PyPI and can be installed with pip:

pip install cluster-over-sampling

Development installation requires cloning the repository and then using PDM to install the project along with its main and development dependencies:

git clone https://github.com/georgedouzas/cluster-over-sampling.git
cd cluster-over-sampling
pdm install

The SOM clusterer requires optional dependencies, which are installed via the som extra:

pip install cluster-over-sampling[som]

Usage

All the classes included in cluster-over-sampling follow the imbalanced-learn API and use the functionality of the base oversampler. Following the scikit-learn convention, the data are represented as follows:

Input data X: 2D array-like or sparse matrices.
Targets y: 1D array-like.
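
As a small illustration of this layout, the sketch below builds a toy imbalanced dataset from plain Python lists (any 2D array-like would serve for X) and counts the samples per class. The values are made up for illustration only.

```python
from collections import Counter

# Hypothetical toy data: 6 samples with 2 features each (2D array-like).
X = [
    [0.1, 1.2],
    [0.3, 0.9],
    [0.2, 1.1],
    [0.4, 1.0],
    [2.5, 3.1],
    [2.7, 2.9],
]
# One target label per sample (1D array-like); class 1 is the minority.
y = [0, 0, 0, 0, 1, 1]

# X and y must describe the same number of samples.
assert len(X) == len(y)

# Count how many samples belong to each class.
class_counts = Counter(y)
print(class_counts)
```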

The clustering-based oversamplers implement a fit method to learn from X and y:

clustering_based_oversampler.fit(X, y)

They also implement a fit_resample method to resample X and y:

X_resampled, y_resampled = clustering_based_oversampler.fit_resample(X, y)
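
To make the two methods concrete, here is a minimal, self-contained sketch of the fit / fit_resample contract. ToyClusterOverSampler, its _cluster helper and the n_to_generate_ attribute are hypothetical names invented for this illustration; they are not part of cluster-over-sampling, whose classes combine real clusterers and oversamplers. The toy splits each minority class on the mean of its first feature (a crude stand-in for a clusterer) and duplicates randomly chosen samples within each group until the classes are balanced.

```python
import random
from collections import Counter


class ToyClusterOverSampler:
    """Hypothetical toy class; NOT part of cluster-over-sampling."""

    def __init__(self, random_state=None):
        self.random_state = random_state

    def fit(self, X, y):
        # Learn how many samples each class needs to match the majority class.
        counts = Counter(y)
        n_majority = max(counts.values())
        self.n_to_generate_ = {label: n_majority - n for label, n in counts.items()}
        return self

    def _cluster(self, samples):
        # Crude stand-in for a real clusterer: split on the mean of feature 0.
        mean = sum(s[0] for s in samples) / len(samples)
        groups = [
            [s for s in samples if s[0] <= mean],
            [s for s in samples if s[0] > mean],
        ]
        return [g for g in groups if g]  # drop empty groups

    def fit_resample(self, X, y):
        self.fit(X, y)
        rng = random.Random(self.random_state)
        # Start from copies of the original data, then append new samples.
        X_resampled = [list(row) for row in X]
        y_resampled = list(y)
        for label, n_new in self.n_to_generate_.items():
            if n_new == 0:
                continue  # the majority class needs no new samples
            clusters = self._cluster([row for row, t in zip(X, y) if t == label])
            for i in range(n_new):
                # Cycle through the clusters and duplicate a random member.
                cluster = clusters[i % len(clusters)]
                X_resampled.append(list(rng.choice(cluster)))
                y_resampled.append(label)
        return X_resampled, y_resampled


X = [[0.1, 1.2], [0.3, 0.9], [0.2, 1.1], [0.4, 1.0], [2.5, 3.1], [2.7, 2.9]]
y = [0, 0, 0, 0, 1, 1]

clustering_based_oversampler = ToyClusterOverSampler(random_state=0)
X_resampled, y_resampled = clustering_based_oversampler.fit_resample(X, y)
print(Counter(y_resampled))
```

After resampling, both classes in this toy dataset contain four samples, and the original six rows are preserved at the start of X_resampled. The actual oversamplers in the package expose the same two calls, so the last three lines carry over once ToyClusterOverSampler is swapped for one of them.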

References

If you use cluster-over-sampling in a scientific publication, we would appreciate citations to any of the following papers:
