imbutil

Additions to the imblearn package


License
MIT
Install
pip install imbutil==0.0.4


Additions to the imbalanced-learn package.

from imblearn import pipeline
from imbutil.combine import MinMaxRandomSampler

# inner_clf is assumed to be an existing scikit-learn-style classifier
# over-sample minority classes up to 100 samples and
# under-sample majority classes down to 800 samples
sampler = MinMaxRandomSampler(min_freq=100, max_freq=800)
sampling_clf = pipeline.make_pipeline(sampler, inner_clf)

1   Installation

pip install imbutil

The MinMaxRandomSampler, like RandomUnderSampler and RandomOverSampler from imbalanced-learn, can technically be used with non-numeric data. However, the current implementation of imbalanced-learn enforces a numeric-data check for all samplers. To bypass this limitation, I have a fork of the project which does not force data to be numeric. You can install it with:

pip install git+https://github.com/shaypal5/imbalanced-learn.git@f6adc562fafdc2198931873799e725e5abdd65a1

2   Basic Use

imbutil additions adhere to the structure of the imblearn package:

2.1   combine

Contains samplers that both under-sample and over-sample:

MinMaxRandomSampler - Randomly samples data to bring all class frequencies into a given range.
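
To make the idea of bringing class frequencies into a range concrete, here is a minimal standard-library sketch. It is not imbutil's implementation: the function name min_max_resample_indices and its seed parameter are hypothetical, chosen only for illustration. It assumes that classes with fewer than min_freq samples are over-sampled with replacement, classes with more than max_freq samples are under-sampled without replacement, and all other classes are left untouched.

```python
import random
from collections import Counter, defaultdict


def min_max_resample_indices(labels, min_freq, max_freq, seed=0):
    """Return sample indices so every class count lies in [min_freq, max_freq].

    Hypothetical helper for illustration; not part of imbutil or imblearn.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    selected = []
    for indices in by_class.values():
        n = len(indices)
        if n < min_freq:
            # Over-sample: keep all originals, draw the rest with replacement.
            indices = indices + rng.choices(indices, k=min_freq - n)
        elif n > max_freq:
            # Under-sample: keep a random subset without replacement.
            indices = rng.sample(indices, max_freq)
        selected.extend(indices)
    return selected


labels = ["a"] * 5 + ["b"] * 50 + ["c"] * 500
idx = min_max_resample_indices(labels, min_freq=20, max_freq=100)
counts = Counter(labels[i] for i in idx)
print(counts)
```

With these inputs, class "a" is raised from 5 to 20 samples, class "b" keeps its 50 samples, and class "c" is reduced from 500 to 100. Because the sketch only inspects labels and returns indices, the same selection can be applied to any feature container, numeric or not.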

3   Contributing

The package author and current maintainer is Shay Palachy (shay.palachy@gmail.com); you are more than welcome to approach him for help. Contributions are very welcome.

3.1   Installing for development

Clone:

git clone git@github.com:shaypal5/imbutil.git

Install in development mode, and with test dependencies:

cd imbutil
pip install -e ".[test]"

3.2   Running the tests

To run the tests use:

cd imbutil
pytest

3.3   Adding documentation

The project is documented using the numpy docstring conventions, which were chosen as they are perhaps the most widespread conventions that are both supported by common tools such as Sphinx and result in human-readable docstrings. When documenting code you add to this project, follow these conventions.

Additionally, if you update this README.rst file, use python setup.py checkdocs to validate that it compiles.

4   Credits

Created by Shay Palachy (shay.palachy@gmail.com).