Simple package for generating n-grams and bag-of-words representations from text.

nlp, text, ngram, ngrams
pip install text2math==0.0.8.dev1


A simple package designed to be used for demonstrating basic Natural Language Processing (NLP) feature engineering in Python.
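As a rough illustration of the two representations named above, the sketch below builds word n-grams and a bag-of-words count using only the standard library. It does not use text2math's own API, which is not described here; the function names `ngrams` and `bag_of_words` and the whitespace tokenization are assumptions made for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) over a token sequence."""
    # A sequence shorter than n yields an empty list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(tokens):
    """Count how often each token occurs, ignoring word order."""
    return Counter(tokens)


# Hypothetical input; lowercasing plus whitespace splitting is the simplest tokenizer.
tokens = "the cat sat on the mat".lower().split()
bigrams = ngrams(tokens, 2)    # pairs of adjacent tokens
counts = bag_of_words(tokens)  # token -> frequency
```

Tuples are used for the n-grams so they can serve as dictionary keys, which makes it straightforward to count them with `Counter` as well.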

Practice Dataset

Stack Exchange Data Dump

Text Encoding

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) by Joel Spolsky


  • chardet - Universal encoding detector for Python 2 and 3
  • cchardet - Universal encoding detector, faster than chardet
  • ftfy - fixes text for you
  • unidecode - ASCII transliterations of Unicode text
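The libraries listed above handle detection, repair, and transliteration far more thoroughly than anything short can. As a minimal standard-library sketch of the underlying problems, the example below decodes bytes by trying a few candidate encodings and then strips accents via Unicode normalization. The helper names `decode_with_fallback` and `strip_accents`, and the choice of candidate encodings, are assumptions for illustration and are not part of any of those libraries.

```python
import unicodedata


def decode_with_fallback(raw, encodings=("utf-8", "cp1252")):
    """Try each candidate encoding in order; fall back to latin-1, which never fails."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    return raw.decode("latin-1")


def strip_accents(text):
    """Decompose characters (NFKD) and drop combining marks, so e-acute becomes e."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# "cafe" with an acute accent, encoded as cp1252: the bytes are not valid UTF-8.
raw = "caf\u00e9".encode("cp1252")
text = decode_with_fallback(raw)   # UTF-8 fails, cp1252 succeeds
ascii_like = strip_accents(text)   # accent removed
```

Trying encodings in a fixed order is only a guess at the true encoding; a statistical detector such as chardet is the more robust choice when the source is unknown.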

Natural Language Processing

Care and Feeding of Topic Models: Problems, Diagnostics, and Improvements

Functional Programming in Python

Functional Programming in Python by David Mertz. Examines the functional aspects of Python: which options work well and which ones you should avoid.


  • toolz - Toolz provides a set of utility functions for iterators, functions, and dictionaries.
  • functools - Higher-order functions and operations on callable objects.
  • itertools - Functions creating iterators for efficient looping.
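To show how these modules relate to n-gram generation, here is a small sketch that builds a lazy sliding window with `itertools` and chains the processing steps with `functools`. The helpers `sliding_window` and `compose` are written out here as assumptions for illustration; toolz ships comparable utilities (for example `sliding_window` and `pipe`), though their argument order differs from this sketch.

```python
from collections import Counter
from functools import partial, reduce
from itertools import islice, tee


def sliding_window(iterable, n):
    """Lazily yield n-item tuples from any iterable, advancing one item at a time."""
    iterators = tee(iterable, n)
    # Advance the i-th copy by i items so that zip lines them up as a window.
    shifted = (islice(it, i, None) for i, it in enumerate(iterators))
    return zip(*shifted)


def compose(*funcs):
    """Compose functions left to right: compose(f, g)(x) == g(f(x))."""
    return reduce(lambda f, g: lambda x: g(f(x)), funcs)


bigrams = partial(sliding_window, n=2)
# Pipeline: lowercase -> split on whitespace -> bigrams -> counts.
bigram_counts = compose(str.lower, str.split, bigrams, Counter)

counts = bigram_counts("To be or not to be")
```

Because `sliding_window` returns an iterator, the n-grams are produced on demand and never held in memory all at once, which matters for large corpora.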