canomaly
Project Description
This package detects specific types of anomalies, with an emphasis on cumulative changes.
Installation
This package can be installed from PyPI using
pip install canomaly
or
pip3 install canomaly
Example Usage
>>> import pandas as pd
>>> from canomaly.searchtools import cumrexpy
>>> # Get some data
>>> data = {
...     'date': [
...         '2018-11-20',
...         '2018-11-21',
...         '2018-11-22',
...         '2018-11-22',
...         '2018-11-23',
...         '2018-11-24'],
...     'email': [
...         'john.doe@example.com',
...         'jane.smith@example.com',
...         'bob-johnson_123@example.com',
...         'sarah@mydomain.co.uk',
...         'frank@mydomain.com',
...         'jessica_lee@mydomain.com'
...     ]
... }
>>> df = pd.DataFrame(data)
>>> df['date'] = pd.to_datetime(df['date'])
>>> # Extract regular expressions
>>> cumrexpy(df, 'email', 'date')
date
2018-11-20 [^john\.doe@example\.com$]
2018-11-21 [^[a-z]{4}\.[a-z]{3,5}@example\.com$]
2018-11-22 [^[a-z]{4,5}[.@][a-z]+[.@][a-z]+\.[a-z]{2,3}$,...
2018-11-23 [^frank@mydomain\.com$, ^[a-z]{4,5}[.@][a-z]+[...
2018-11-24 [^frank@mydomain\.com$, ^[a-z]+[.@_][a-z]+[.@]...
Name: email_grouped, dtype: object
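To make "cumulative" concrete: the output above is consistent with each date's patterns summarizing every email seen up to and including that date. The following is a minimal sketch of that grouping step only, written with plain pandas and itertools. It is an assumption based on the output shown, not canomaly's actual implementation, and names such as per_date and email_cumulative are illustrative.

```python
from itertools import accumulate

import pandas as pd

# Same example data as above.
df = pd.DataFrame({
    'date': pd.to_datetime([
        '2018-11-20', '2018-11-21', '2018-11-22',
        '2018-11-22', '2018-11-23', '2018-11-24',
    ]),
    'email': [
        'john.doe@example.com',
        'jane.smith@example.com',
        'bob-johnson_123@example.com',
        'sarah@mydomain.co.uk',
        'frank@mydomain.com',
        'jessica_lee@mydomain.com',
    ],
})

# Collect the emails observed on each date (groupby sorts the dates).
per_date = df.groupby('date')['email'].agg(list)

# Concatenate the per-date lists cumulatively, so that each date holds
# everything seen so far. This is an assumed reading of "cumulative".
cumulative = pd.Series(
    list(accumulate(per_date, lambda seen, new: seen + new)),
    index=per_date.index,
    name='email_cumulative',  # hypothetical name, not produced by canomaly
)

for date, emails in cumulative.items():
    print(date.date(), len(emails))
```

Under this reading, a pattern extractor would then be run on each cumulative list, which would explain why later rows keep earlier patterns and only generalize or add to them.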
The same results, shown as a Markdown table so that the patterns are not truncated:
date | email_grouped |
---|---|
2018-11-20 00:00:00 | ['^john\.doe@example\.com$'] |
2018-11-21 00:00:00 | ['^[a-z]{4}\.[a-z]{3,5}@example\.com$'] |
2018-11-22 00:00:00 | ['^[a-z]{4,5}[.@][a-z]+[.@][a-z]+\.[a-z]{2,3}$', '^bob\-johnson_123@example\.com$'] |
2018-11-23 00:00:00 | ['^frank@mydomain\.com$', '^[a-z]{4,5}[.@][a-z]+[.@][a-z]+\.[a-z]{2,3}$', '^bob\-johnson_123@example\.com$'] |
2018-11-24 00:00:00 | ['^frank@mydomain\.com$', '^[a-z]+[.@_][a-z]+[.@][a-z]+\.[a-z]{2,3}$', '^bob\-johnson_123@example\.com$'] |
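One way such patterns might be used for anomaly detection is to test newly arriving values against the patterns from the previous date and flag anything that matches none of them. The sketch below does this with Python's standard re module, using the patterns from the 2018-11-23 row. The helper matches_any is hypothetical and not part of canomaly.

```python
import re

# Patterns copied from the 2018-11-23 row of the table above.
known_patterns = [
    r'^frank@mydomain\.com$',
    r'^[a-z]{4,5}[.@][a-z]+[.@][a-z]+\.[a-z]{2,3}$',
    r'^bob\-johnson_123@example\.com$',
]

# Email first observed on 2018-11-24 in the example data.
new_emails = ['jessica_lee@mydomain.com']


def matches_any(value, patterns):
    """Return True if value matches at least one regular expression."""
    return any(re.search(pattern, value) for pattern in patterns)


# Values that no known pattern covers are candidates for review.
unmatched = [e for e in new_emails if not matches_any(e, known_patterns)]
print(unmatched)
```

Here the new address matches none of the earlier patterns, which is in line with the 2018-11-24 row containing a broader pattern that allows an underscore in the local part.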
Build Documentation Locally
cd /path/to/canomaly/docs
make html