Text Generator
Motivation
Sometimes we need random datasets to test our algorithms. For example, to check a cardinality estimation algorithm you need datasets with n distinct elements and N total elements. With text_generator you can generate such datasets by choosing n and N, and optionally the distribution the elements should follow (uniform, exponential, Zipf, ...).
Code Example
text_generator can be used as a command-line tool that writes its output to stdout:
text_generator 10 100 uniform
or as a Python module:
from text_generator import TextGenerator
t_gen = TextGenerator(distribution='uniform')
t_gen.generate_dictionary(10)
for elem in t_gen.generate_stream(100):
    print(elem)
The API provides options for creating dictionaries and for generating the text as streams (using Python generators).
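To make the dictionary-then-stream idea concrete, here is a minimal standalone sketch that uses only the Python standard library. It is not text_generator's implementation: `make_dictionary`, `zipf_weights` and the `generate_stream` function below are hypothetical helpers written for illustration, and the `word_<i>` token format is an assumption.

```python
import random


def make_dictionary(n):
    """Return n distinct tokens (hypothetical format: word_0 ... word_{n-1})."""
    return [f"word_{i}" for i in range(n)]


def zipf_weights(n, s=1.0):
    """Unnormalised Zipf-like weights: the element of rank k gets 1 / k**s."""
    return [1.0 / (k ** s) for k in range(1, n + 1)]


def generate_stream(dictionary, total, weights=None, seed=None):
    """Lazily yield `total` elements drawn from `dictionary`.

    weights=None draws uniformly; otherwise elements are drawn in
    proportion to the given weights.
    """
    rng = random.Random(seed)
    for _ in range(total):
        yield rng.choices(dictionary, weights=weights, k=1)[0]


words = make_dictionary(10)
stream = list(generate_stream(words, 100, weights=zipf_weights(10), seed=42))
print(len(stream), len(set(stream)))
```

Because the elements are sampled at random, a stream built this way contains at most n distinct elements. A generator that must guarantee exactly n would need an extra step, for example emitting every dictionary entry once before sampling the rest.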
Installation
You can install the library through pip:
pip install text_generator
The library depends on NumPy.
License
text_generator is released under the MIT license. See the LICENSE file for details.