text-generator

Fast text stream generator. You can set how many distinct elements the stream contains, how many elements it contains in total, and the probability of each element.


Keywords
streams, text, data, generator
License
MIT
Install
pip install text-generator==0.1.0

Documentation

Text Generator

Motivation

Sometimes we need to create random datasets to test our algorithms. For example, when checking a cardinality estimation algorithm you need datasets with n distinct elements and N total elements. With text_generator you can generate these datasets by choosing n and N, and optionally the distribution the elements should follow (uniform, exponential, Zipf, ...).
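To make the cardinality use case concrete, here is a minimal sketch of how the ground truth (n, N) of a stream could be measured so that an estimator's output can be compared against it. The helper name stream_stats is hypothetical and is not part of text_generator; it works on any iterable of hashable elements.

```python
def stream_stats(stream):
    """Return (n_distinct, n_total) for any iterable of hashable elements.

    Hypothetical helper for illustration; not part of text_generator.
    """
    seen = set()   # exact set of distinct elements (the ground truth)
    total = 0      # number of elements consumed from the stream
    for elem in stream:
        seen.add(elem)
        total += 1
    return len(seen), total


n, N = stream_stats(["a", "b", "a", "c", "b", "a"])
print(n, N)  # 3 6
```

An exact set uses memory proportional to n, which is fine for test datasets and is precisely the cost that cardinality estimators try to avoid.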

Code Example

text_generator can be used as a command-line tool that writes to stdout:

text_generator 10 100 uniform

or as a Python module:

from text_generator import TextGenerator

t_gen = TextGenerator(distribution='uniform')
t_gen.generate_dictionary(10)

for elem in t_gen.generate_stream(100):
    print(elem)

The API provides options to create dictionaries and to generate the text as streams (using Python generators).
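The idea of a dictionary plus a lazily generated stream can be sketched with the standard library alone. This is not the library's implementation (which depends on numpy); the names make_dictionary, zipf_weights and stream, and the seed parameter, are assumptions made up for this illustration.

```python
import random


def make_dictionary(n):
    """Return n distinct placeholder words: 'w0', 'w1', ... (illustrative only)."""
    return [f"w{i}" for i in range(n)]


def zipf_weights(n, s=1.0):
    """Unnormalised Zipf-like weights: the k-th element gets weight 1 / k**s."""
    return [1.0 / (k ** s) for k in range(1, n + 1)]


def stream(dictionary, total, weights=None, seed=None):
    """Lazily yield `total` elements drawn from `dictionary`.

    weights=None gives a uniform distribution; otherwise each element is
    drawn with probability proportional to its weight.
    """
    rng = random.Random(seed)  # local RNG so a seed makes the stream repeatable
    for _ in range(total):
        yield rng.choices(dictionary, weights=weights, k=1)[0]


words = make_dictionary(10)
sample = list(stream(words, 100, weights=zipf_weights(10), seed=42))
print(len(sample), len(set(sample)) <= 10)
```

Because stream is a generator, elements are produced one at a time and the full dataset never has to be held in memory unless the caller collects it, as the list() call does here.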

Installation

You can install the library through pip:

pip install text_generator

The library depends on numpy.

License

text_generator is released under the MIT license. See the LICENSE file for details.