preprocessy

Data Preprocessing framework that provides customizable pipelines.


Keywords
Data, Pipelines, Preprocessing, Science, data-engineering, data-preprocessing-pipelines, data-science, hacktoberfest, hacktoberfest2022, machine-learning, python-library, under-construction
License
MIT
Install
pip install preprocessy==1.0.4

Documentation

Preprocessy

Data Preprocessing library that provides customizable pipelines.

Setup

  • Clone the repo and install the dependencies in a virtual environment (venv). Installing requirements_dev.txt also installs everything in requirements.txt.
    $ pip install -r requirements_dev.txt
  • Create a folder called datasets in the root directory. Its content can be found here

  • All code goes inside preprocessy. All test scripts go inside tests. All evaluation scripts go inside evaluations.
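
The layout described above can be checked with a few lines of Python. This is only a minimal sketch: the helper name prepare_layout is hypothetical and not part of preprocessy, and it assumes the four directory names listed in the setup notes. It creates datasets if it is absent (it does not download the contents) and reports which of the expected folders are still missing.

```python
from pathlib import Path

# Directory names taken from the setup notes above.
EXPECTED_DIRS = ("preprocessy", "tests", "evaluations", "datasets")


def prepare_layout(root="."):
    """Hypothetical helper: create datasets/ and list expected folders that are missing."""
    root = Path(root)
    # datasets/ is the only folder the setup notes ask you to create by hand.
    (root / "datasets").mkdir(exist_ok=True)
    # Report, in the order above, every expected directory that does not exist.
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

Calling prepare_layout() from the repository root should return an empty list once the clone is complete.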

Steps before committing

  • Run tests from root directory.
    $ pytest -v -s
  • Run linter from root directory.
    $ pylint *.py
  • Run the code formatter and spell checker from root directory.
    $ black . && codespell --skip=".git,*.gif,*.png,*.PNG,./venv,*.json,./datasets,./.DS_Store,./tests/__pycache__"
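
The three steps above can also be chained so that the run stops at the first failure. The following is a sketch, not a script shipped with the project: run_checks and its run parameter are hypothetical names, and the command strings are copied verbatim from the steps above. It assumes a POSIX shell and that pytest, pylint, black and codespell are installed in the active environment.

```python
import subprocess

# Commands copied from the steps above, in the same order.
CHECKS = [
    "pytest -v -s",
    "pylint *.py",
    'black . && codespell --skip=".git,*.gif,*.png,*.PNG,./venv,*.json,'
    './datasets,./.DS_Store,./tests/__pycache__"',
]


def run_checks(commands=CHECKS, run=subprocess.run):
    """Hypothetical helper: run each command in order.

    Returns the first command that exits with a non-zero status,
    or None if every command succeeds.
    """
    for command in commands:
        # shell=True so that `*.py` and `&&` are interpreted by the shell.
        result = run(command, shell=True)
        if result.returncode != 0:
            return command
    return None
```

Run it from the root directory, for example with failed = run_checks(); a return value of None means every check passed.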