grimlock

Simple tool that assists with preprocessing pandas dataframes for Machine Learning.


License: MIT
Install: pip install grimlock==0.0.1

Documentation

Grimlock

We all know that when it comes to machine learning, preprocessing your data takes far more time than actually building a model. Enter grimlock.

grimlock fills in missing values and encodes categorical features. Feature scaling is planned but not yet available.

Installation

Provided you already have NumPy, SciPy, scikit-learn, and pandas installed, the grimlock package is pip-installable:

$ pip install grimlock

Cleaning Missing Data

A blend of pandas' DataFrame.fillna() and scikit-learn's Imputer.

from grimlock import clean_missing
clean_missing(dataframe, column, clean_type='zero')

Parameters

  • dataframe: dataframe variable
  • column: column name (string)
  • clean_type: 'zero' (default), 'mean', 'mode', 'most_frequent' (string)
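To make the clean_type options concrete, here is a minimal sketch of roughly equivalent behaviour written with plain pandas. It is not grimlock's implementation: the helper name fill_missing is hypothetical, the sketch assumes 'mode' and 'most_frequent' both mean the most common value, and it assumes a new dataframe is returned rather than the input being modified in place.

```python
import pandas as pd


def fill_missing(df, column, clean_type="zero"):
    """Hypothetical stand-in for clean_missing, using plain pandas only."""
    if clean_type == "zero":
        fill_value = 0
    elif clean_type == "mean":
        fill_value = df[column].mean()  # NaN values are skipped by default
    elif clean_type in ("mode", "most_frequent"):
        # Assumption: both names select the most common value in the column.
        fill_value = df[column].mode().iloc[0]
    else:
        raise ValueError(f"unknown clean_type: {clean_type!r}")

    result = df.copy()  # assumption: leave the caller's dataframe untouched
    result[column] = result[column].fillna(fill_value)
    return result


df = pd.DataFrame({"age": [20.0, None, 40.0, 40.0]})

zero_filled = fill_missing(df, "age", "zero")
mean_filled = fill_missing(df, "age", "mean")
mode_filled = fill_missing(df, "age", "most_frequent")
```

In this example the missing age becomes 0 with 'zero', the average of the three known values with 'mean', and 40 with 'most_frequent'.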

Convert Categorical

Quick conversion of non-ordinal categorical features.

from grimlock import convert_categorical
convert_categorical(dataframe, column, target_column)

Parameters

  • dataframe: dataframe variable
  • column: column name (string)
  • target_column: target column name (string)
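For readers unfamiliar with encoding non-ordinal categories, the sketch below shows one common approach, one-hot encoding with pandas.get_dummies. This is an illustration only: whether grimlock uses this technique, and how it uses target_column, is not stated here, so the sketch simply leaves the target column untouched. The column names are made up for the example.

```python
import pandas as pd

# Toy data: "color" is a non-ordinal categorical feature, "label" is the target.
df = pd.DataFrame({
    "color": ["red", "green", "red"],
    "label": [1, 0, 1],
})

# One-hot encode only the categorical column; every other column is kept as-is.
# Assumption: this mirrors the kind of conversion convert_categorical performs.
encoded = pd.get_dummies(df, columns=["color"])

print(encoded)
```

Each distinct value in the color column becomes its own indicator column (color_green, color_red), so no artificial ordering is imposed on the categories.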

Feature Scaling

Coming soon.