StreamSampler

A data sampler for streaming data


Keywords
Reservoir, sampling
License
MIT
Install
pip install StreamSampler==0.1.1

Documentation

StreamSampler

The StreamSampler package lets you sample a fixed number of elements from a stream of data whose length is very large or unknown.

StreamSampler is available both as an executable command and as a library. It uses the reservoir sampling algorithm [Vitter85].
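For readers unfamiliar with reservoir sampling, the following is a minimal sketch of its simplest form (often called Algorithm R). It is an illustration only and is not StreamSampler's actual code or API: the function name `reservoir_sample` and its parameters are hypothetical, and the package may use a different variant of the algorithm.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    stream: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Return up to k items chosen uniformly at random from stream in one pass.

    Hypothetical helper for illustration; not part of StreamSampler.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item number i + 1 replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # both endpoints are inclusive
            if j < k:
                reservoir[j] = item
    return reservoir
```

Calling `reservoir_sample(open("big.log"), 100)` would, under these assumptions, keep 100 lines of a file without loading it into memory or knowing its length in advance (`big.log` is a placeholder name). If the stream holds fewer than `k` items, all of them are returned.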

License

MIT License

See Also

  • sample-cli by Paul Butler is a command-line tool that provides almost the same functionality. StreamSampler is intended primarily as a library, although it also has a command-line interface, so that it can be used as part of other packages, including my future projects.