package-outlier

This is an outlier detection package


Keywords
outlier-detection, pypi, zscore
License
MIT
Install
pip install package-outlier==0.9

Documentation


package-outlier

A Python-based outlier detection package, distributed on PyPI.

Install

Read the online Installation instructions.

This software depends on NumPy and SciPy, Python packages for scientific computing. Install them before installing package-outlier.

Install the latest version of package-outlier

$ pip install package-outlier

If the module is not already installed, pip displays a message and downloads it, then installs package-outlier and all its dependencies. You may need root privileges to install setuptools.

Outlier detection by zscore

The z-score is a common method for detecting anomalies in one-dimensional data. For a given data point, the z-score is calculated as:
zscore = (data_point - mean) / std_dev

The function takes the data and a threshold value as required arguments and returns the data points that are outliers.
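
The z-score formula can be sketched with the Python standard library alone. This is a minimal illustration, not the package's implementation: the helper names zscores and points_above are hypothetical, the population standard deviation (ddof = 0) is assumed, and only scores above the threshold are flagged.

```python
import statistics


def zscores(data):
    """Return (x - mean) / std_dev for each x (population std, ddof = 0)."""
    mean = statistics.fmean(data)
    std_dev = statistics.pstdev(data)  # raises ZeroDivisionError below if all values are equal
    return [(x - mean) / std_dev for x in data]


def points_above(data, threshold):
    """Return the data points whose z-score is greater than the threshold."""
    return [x for x, z in zip(data, zscores(data)) if z > threshold]


data = [8, 5, 7, 8, 11, 13, 4, 9, 10, 7, 6]
print(points_above(data, 1))  # [11, 13]
```

Here the mean is 8 and the population standard deviation is about 2.52, so only 11 and 13 score above 1.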

Outlier detection by modified_zscore

The mean and standard deviation are themselves sensitive to outliers, so the modified z-score uses the median instead of the mean and the median absolute deviation (MAD) instead of the standard deviation.
For more information on the median absolute deviation, see https://en.wikipedia.org/wiki/Median_absolute_deviation.
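
To see why the median and MAD are preferred, the sketch below adds one extreme value to a small sample and compares the statistics. It uses only the standard library. The helper modified_zscores is hypothetical, and dividing by consistency_correction * MAD is an assumed form of the modified z-score; the package's exact formula may differ.

```python
import statistics


def modified_zscores(data, consistency_correction=1.4826):
    """Return ((x - median) / (consistency_correction * MAD) for each x, MAD)."""
    median = statistics.median(data)
    mad = statistics.median([abs(x - median) for x in data])
    scores = [(x - median) / (consistency_correction * mad) for x in data]
    return scores, mad


clean = [8, 5, 7, 8, 11, 13, 4, 9, 10, 7, 6]
contaminated = clean + [100]  # one extreme value added

# The mean moves from 8 to about 15.67, while the median stays at 8.
print(statistics.fmean(clean), statistics.fmean(contaminated))
print(statistics.median(clean), statistics.median(contaminated))

# The MAD stays at 2, so the extreme value receives a very large score.
scores, mad = modified_zscores(contaminated)
print(mad, scores[-1])
```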

Parameters for package_outlier.zscore_and_anomaly_detection()

data : array_like
An array-like object containing the sample data.

threshold : int
The threshold value used to decide which data points are outliers.

flag : int, optional
Default is 1. Use 1 to select data points greater than the threshold value and 0 to select data points less than the threshold value.

ddof : int, optional
Degrees of freedom correction in the calculation of the standard deviation. Default is 0.

nan_policy : {"propagate", "raise", "omit"}, optional
Defines how to handle input that contains NaN. "propagate" returns NaN, "raise" throws an error, and "omit" performs the calculations ignoring NaN values. Default is "propagate".

Returns:
zscore: array_like
The z-scores, standardized by the mean and standard deviation of the input array.

outliers: array_like
These are points from data that are outliers.
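
The effect of ddof and nan_policy can be illustrated with a small standard-library sketch. The std_dev helper below is hypothetical and only mirrors the parameter descriptions above (divisor n - ddof, and the three NaN policies); it is not the package's code.

```python
import math


def std_dev(data, ddof=0, nan_policy="propagate"):
    """Standard deviation with divisor (n - ddof) and a simple NaN policy."""
    if any(math.isnan(x) for x in data):
        if nan_policy == "raise":
            raise ValueError("input contains nan")
        if nan_policy == "propagate":
            return math.nan
        if nan_policy != "omit":
            raise ValueError(f"unknown nan_policy: {nan_policy!r}")
        data = [x for x in data if not math.isnan(x)]  # "omit": drop NaN values
    n = len(data)
    mean = sum(data) / n
    return math.sqrt(sum((x - mean) ** 2 for x in data) / (n - ddof))


data = [8, 5, 7, 8, 11, 13, 4, 9, 10, 7, 6]
with_nan = data + [float("nan")]

print(std_dev(data))                         # divisor n: sqrt(70 / 11)
print(std_dev(data, ddof=1))                 # divisor n - 1: sqrt(7)
print(std_dev(with_nan))                     # nan
print(std_dev(with_nan, nan_policy="omit"))  # same value as std_dev(data)
```

A larger ddof gives a slightly larger standard deviation and therefore slightly smaller z-scores for the same data.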

Parameters for package_outlier.modified_zscore_and_anomaly_detection()

data : array_like
An array-like object containing the sample data.

threshold : int
The threshold value used to decide which data points are outliers.

flag : int, optional
Default is 1. Use 1 to select data points greater than the threshold value and 0 to select data points less than the threshold value.

consistency_correction : float, optional
Scale factor that converts the MAD to an estimate of the standard deviation for a given distribution. Default is 1.4826, the value for normally distributed data.

Returns:
modified_zscore: array_like
The modified z-scores of the input array.

mad: float
The median absolute deviation of the input array.

outliers: array_like
These are points from data that are outliers.
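
The default consistency_correction of 1.4826 is the reciprocal of the 75th percentile of the standard normal distribution. The short check below reproduces the constant with the standard library; the variable names are illustrative only.

```python
from statistics import NormalDist

# For normally distributed data, MAD = std_dev * inv_cdf(0.75),
# so std_dev is approximately MAD / inv_cdf(0.75).
k = 1 / NormalDist().inv_cdf(0.75)
print(round(k, 4))  # 1.4826

# Example: a MAD of 2 corresponds to an estimated standard deviation of about 2.97.
mad = 2
estimated_std = k * mad
print(round(estimated_std, 4))  # 2.9652
```

For data from another distribution, a different correction factor would be appropriate.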

Example

import numpy as np
import package_outlier

# Sample data as a NumPy array
arr = np.array([8, 5, 7, 8, 11, 13, 4, 9, 10, 7, 6])

# Z-score method with a threshold of 1
zscore, outliers = package_outlier.zscore_and_anomaly_detection(arr, 1)
print(zscore)
print(outliers)

# Modified z-score method with a threshold of 1
modified_zscore, mad, outliers = package_outlier.modified_zscore_and_anomaly_detection(arr, 1)
print(modified_zscore)
print(mad)
print(outliers)

Contribute

You've discovered a bug or something else you want to change - excellent!

You've worked out a way to fix it – even better!

You want to tell us about it – best of all!

Start at the contributing guide!