cleanser

Tools for cleaning text.


Keywords
cleansing, data
License
MIT
Install
pip install cleanser==0.2.3

Documentation



Utilities for cleaning text for NLP and other workflows.

Installation

pip install cleanser

Usage

from cleanser import Cleanser

text = """Hello World....

😺😺 Python is 👌😀😀 awesome  
"""

Cleanser(text).emoji().double_punctuation().whitespaces().text
>>> "Hello World. Python is awesome"

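The chained call above follows a fluent-interface pattern: each cleaning step returns the object itself, so steps can be composed in any order and the result read from `.text`. The sketch below shows how such a pattern could be built with the standard library only. It is not the cleanser implementation: the `ChainCleaner` class, its regular expressions, and the emoji ranges are assumptions made for illustration, and the real library may behave differently on other inputs.

```python
import re


class ChainCleaner:
    """Hypothetical stand-in for illustration; not the cleanser source code."""

    # Rough emoji ranges (an assumption; not a complete emoji definition).
    _EMOJI = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
    # The same punctuation mark repeated two or more times in a row.
    _REPEATED_PUNCT = re.compile(r"([.!?,;:])\1+")

    def __init__(self, text: str) -> None:
        self.text = text

    def emoji(self) -> "ChainCleaner":
        # Drop every character that falls in the emoji ranges above.
        self.text = self._EMOJI.sub("", self.text)
        return self

    def double_punctuation(self) -> "ChainCleaner":
        # Collapse a run such as "...." into a single ".".
        self.text = self._REPEATED_PUNCT.sub(r"\1", self.text)
        return self

    def whitespaces(self) -> "ChainCleaner":
        # Collapse all whitespace runs (including newlines) and trim the ends.
        self.text = " ".join(self.text.split())
        return self


sample = "Hello World....\n\n😺😺 Python is 👌😀😀 awesome  \n"
result = ChainCleaner(sample).emoji().double_punctuation().whitespaces().text
print(result)
```

Returning `self` from every step is what makes the chaining work. Running `whitespaces()` last matters in this sketch, because removing emoji leaves extra spaces behind.
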
Contributing

Setup

  1. Install Poetry
  2. Run make setup to prepare the workspace

Testing

  1. Run make test to run all tests

Linting and Formatting

  1. Run make format to run the Black code formatter
  2. Run make lint to run pylint
  3. Run make mypy to run mypy