data-complexity

The Data Complexity Measures in Python

Install

$ pip install data-complexity

How it works

Maximum Fisher's Discriminant Ratio (F1)

from dcm import dcm
from sklearn import datasets

iris = datasets.load_iris()
X = iris.data
y = iris.target

index, F1 = dcm.F1(X, y)

Fraction of Borderline Points (N1)

from dcm import dcm
from sklearn import datasets

bc = datasets.load_breast_cancer(as_frame=True)
X = bc.data.values
y = bc.target.values

N1 = dcm.N1(X, y)

Entropy of Class Proportions (C1) and Imbalance Ratio (C2)

from dcm import dcm
from sklearn import datasets

bc = datasets.load_breast_cancer(as_frame=True)
X = bc.data.values
y = bc.target.values

C1, C2 = dcm.C12(X, y)

Other Measures

Coming soon...

References

[1] How Complex is your classification problem? A survey on measuring classification complexity, https://arxiv.org/abs/1808.03591

[2] The Extended Complexity Library (ECoL), https://github.com/lpfgarcia/ECoL

data-complexity
Release 0.0.5

Release 0.0.5

0.1.3

0.1.2

0.1.1

0.1.0

0.0.8

0.0.7

0.0.6

0.0.5

0.0.3

0.0.1

Documentation

data-complexity

Install

How it works

Maximum Fisher's Discriminant Ratio (F1)

Fraction of Borderline Points (N1)

Entropy of Class Proportions (C1) and Imbalance Ratio (C2)

Other Measures

References

Stats

Development practices

Releases

Contributors

data-complexity Release 0.0.5

Release 0.0.5 Toggle Dropdown 0.1.3 0.1.2 0.1.1 0.1.0 0.0.8 0.0.7 0.0.6 0.0.5 0.0.3 0.0.1

Documentation

data-complexity

Install

How it works

Maximum Fisher's Discriminant Ratio (F1)

Fraction of Borderline Points (N1)

Entropy of Class Proportions (C1) and Imbalance Ratio (C2)

Other Measures

References

Stats

Development practices

Releases

Contributors

data-complexity
Release 0.0.5

Release 0.0.5

0.1.3

0.1.2

0.1.1

0.1.0

0.0.8

0.0.7

0.0.6

0.0.5

0.0.3

0.0.1