LaPros works with classifiers: it ranks suspicious labels given the
probabilities predicted by a classification model. You can pass plain
Python lists, NumPy arrays, or Pandas data. Results come back as a NumPy
array or a Pandas series; the larger the value, the more suspicious the
corresponding label.
help(suspect)
Help on function suspect in module lapros.api:
suspect(...)
Rank the suspicious labels given probas from a classifier.
Accept Numpy arrays, Pandas dataframes and series.
Labels can be integers, strings, or even floats, provided that
the probability matrix's columns are indexed by the same label set.
#### Args
- probas (n x m matrix): probabilities for the possible classes.
#### KwArgs
- labels (n x 1 vector): observed class labels
- rank_method (str): `residual` or `confidence`
- return_non_errors (bool, default = True): return all rows, including non-errors
#### Returns
a Pandas DataFrame with 1 index and 2 columns:
- id (int): the index, identical to the row index of the original data
- err (float): the magnitude of suspiciousness, in the range [0, 1]
- suspected (bool): whether the row is suspected of having a label error. This column is returned only if return_non_errors=True.
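To make the idea of label-indexed probability columns and a per-row suspiciousness score concrete, here is a minimal pure-Python sketch. It is not LaPros's implementation and does not call the library: `toy_suspicion_scores` and its `classes` argument are hypothetical names introduced for illustration, and the score used (one minus the probability assigned to the observed label) is only an assumed stand-in for a confidence-style ranking.

```python
def toy_suspicion_scores(probas, labels, classes):
    """Hypothetical helper, NOT the LaPros algorithm.

    Scores each row by 1 - P(observed label) and returns (id, err)
    pairs sorted from most to least suspicious.
    """
    # Map each class label to its column position in the probability matrix,
    # mirroring the requirement that columns are indexed by the label set.
    column_of = {label: j for j, label in enumerate(classes)}
    scores = []
    for row_id, (row, observed) in enumerate(zip(probas, labels)):
        # A low probability for the observed label gives a score close to 1.
        err = 1.0 - row[column_of[observed]]
        scores.append((row_id, err))
    # Most suspicious rows first.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Three rows, two classes; column 0 is "cat", column 1 is "dog".
probas = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.7, 0.3],
]
labels = ["cat", "cat", "dog"]
classes = ["cat", "dog"]

ranked = toy_suspicion_scores(probas, labels, classes)
for row_id, err in ranked:
    print(row_id, round(err, 3))
```

Rows 1 and 2 rank highest here because the model gives their observed labels low probability. A real detector such as LaPros may weigh the evidence differently, so treat this only as an intuition for the `id` and `err` values described above.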