An open-source Python library for text cleaning tasks.


Keywords
ai, cleaner, datasets, nlp, profanity-detection, profanity-filter, python, removal, sensitive-data, sensitive-data-detection, text-cleaning
License
MIT
Install
pip install valx==0.1.9

Documentation

ValX


An open-source Python library for data cleaning tasks. ValX detects and removes profanity and personal (sensitive) information, and uses AI to detect hate speech and offensive language.

Installation

You can install ValX using pip:

pip install valx

Supported Python Versions

ValX supports the following Python versions:

  • Python 3.6
  • Python 3.7
  • Python 3.8
  • Python 3.9
  • Python 3.10
  • Python 3.11 or later (preferred)

Please ensure that you have one of these Python versions installed before using ValX. ValX may not work as expected on Python versions older than those listed.
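
A script that depends on ValX could check the interpreter version before importing the library. The following is a minimal sketch using only the standard library. The helper name is_supported_python is hypothetical and not part of ValX, and the 3.6 minimum is simply taken from the list above.

```python
import sys

# Lowest version in the supported list above.
MIN_SUPPORTED = (3, 6)


def is_supported_python(version_info=None):
    """Return True if the given (or running) Python is at least MIN_SUPPORTED.

    Hypothetical helper for illustration; it is not provided by ValX.
    """
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); patch level does not matter here.
    return tuple(version_info[:2]) >= MIN_SUPPORTED


if __name__ == "__main__":
    if not is_supported_python():
        raise SystemExit("ValX requires Python 3.6 or later.")
```

Running the check at the top of an entry-point script gives a clear error message instead of a failure later on.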

Features

  • Detect Profanity: Detect profane and NSFW words or terms.
  • Remove Profanity: Remove profane and NSFW words or terms.
  • Detect Sensitive Information: Detect sensitive information in text data.
  • Remove Sensitive Information: Remove sensitive information from text data.
  • Detect Hate Speech: Detect hate speech or offensive speech in text, using AI.

Usage

The examples below assume that sample_text and sample_text2 are already defined and hold the text to process.

Detect Profanity

from valx import detect_profanity

# Detect profanity
num_profanities = detect_profanity(sample_text, language='English')

Remove Profanity

from valx import remove_profanity

# Remove profanity
removed = remove_profanity(sample_text, "text_cleaned.txt", language="English")
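
For readers curious how word-list-based filtering of this kind can work, here is a minimal, self-contained sketch. It is not ValX's implementation: the function names, the placeholder word list, and the choice to mask matches with asterisks are all assumptions made for illustration.

```python
import re

# Placeholder word list for illustration only; ValX ships its own data.
BLOCKLIST = {"darn", "heck"}

# A "word" here is a run of letters and apostrophes.
WORD_RE = re.compile(r"[A-Za-z']+")


def find_blocked_words(lines, blocklist=BLOCKLIST):
    """Return (line_number, word) pairs for every blocklisted word found."""
    hits = []
    for line_number, line in enumerate(lines, start=1):
        for word in WORD_RE.findall(line):
            if word.lower() in blocklist:
                hits.append((line_number, word))
    return hits


def mask_blocked_words(lines, blocklist=BLOCKLIST):
    """Return copies of the lines with blocklisted words replaced by asterisks."""
    def mask(match):
        word = match.group(0)
        return "*" * len(word) if word.lower() in blocklist else word

    return [WORD_RE.sub(mask, line) for line in lines]


sample_lines = ["Well, heck.", "Nothing to see here.", "Darn it!"]
print(find_blocked_words(sample_lines))
print(mask_blocked_words(sample_lines))
```

Matching on whole words rather than substrings avoids flagging harmless words that merely contain a blocklisted term.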

Detect Sensitive Information

from valx import detect_sensitive_information

# Detect sensitive information
detected_sensitive_info = detect_sensitive_information(sample_text)

Remove Sensitive Information

from valx import remove_sensitive_information

# Remove sensitive information
cleaned_text = remove_sensitive_information(sample_text2)
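
As a rough picture of what detecting and removing sensitive information can involve, the sketch below uses regular expressions to find and redact email addresses and one phone number format. It is an assumption-laden illustration, not ValX's method: the patterns, the function names find_sensitive and redact_sensitive, and the "[REDACTED]" placeholder are all hypothetical.

```python
import re

# Illustrative patterns only; real-world detection needs far broader coverage.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def find_sensitive(text):
    """Return a list of (kind, matched_text) pairs found in the text."""
    found = []
    for kind, pattern in PATTERNS.items():
        found.extend((kind, match.group(0)) for match in pattern.finditer(text))
    return found


def redact_sensitive(text, placeholder="[REDACTED]"):
    """Return the text with every pattern match replaced by the placeholder."""
    for pattern in PATTERNS.values():
        text = pattern.sub(placeholder, text)
    return text


sample = "Contact jane@example.com or call 555-123-4567."
print(find_sensitive(sample))
print(redact_sensitive(sample))
```

Keeping detection and redaction as separate functions mirrors the detect/remove split in the usage examples above, so a caller can review matches before altering the text.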

Detect Hate Speech And Offensive Language

from valx import detect_hate_speech

# Detect hate speech or offensive language
outcome_of_detection = detect_hate_speech("You are stupid.")

Contributing

Contributions are welcome! If you encounter any issues, have suggestions, or want to contribute to ValX, please open an issue or submit a pull request on GitHub.

License

ValX is released under the terms of the MIT License (Modified). Please see the LICENSE file for the full text.

Derived licenses


ValX uses data from this GitHub repository: https://github.com/LDNOOBW/List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words/ © 2012-2020 Shutterstock, Inc.

Creative Commons Attribution 4.0 International License: https://github.com/LDNOOBW/List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words/blob/master/LICENSE


Modified License Clause

The modified license clause grants users permission to make derivative works based on the ValX software. However, it requires that any substantial changes to the software be clearly distinguished from the original work and distributed under a different name.