sparseklearn

Sparsified algorithms for machine learning.


License

MIT

Install

pip install sparseklearn==0.1.4

Documentation

Sparseklearn

Dimensionality reduction for machine learning.

Overview

Sparseklearn is a Python package of machine learning algorithms based on dimensionality reduction. By working on compressed data, Sparseklearn performs standard machine learning tasks more efficiently and uses less memory. Its algorithms are all single-pass: they need to access the raw data only once, which also makes them applicable to streaming data.

A note on optimization

Sparseklearn is being developed as a proof of concept for our work in statistical learning and compressed sensing. It is currently at the prototype stage and has not yet been optimized. In particular, many of the computational demands in Sparseklearn have been pushed to a preconditioning step, currently a discrete cosine transform. This operation is fast and parallelizable, but it can still be a bottleneck. We use scipy's dct function, but you may want to precompute the transform in a more efficient and distributed fashion. Sparseklearn can work directly with preconditioned data; it does not need access to the original raw data.
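
As a rough illustration of precomputing the preconditioning step yourself, the sketch below applies an orthonormal discrete cosine transform to each row of a data matrix using scipy. It is not Sparseklearn's API: the function names precondition and undo_precondition are hypothetical, and the random sign flip applied before the transform is an assumption borrowed from common compressed-sensing practice, not something this README specifies. Check the package source for the exact transform it expects before feeding it precomputed data.

```python
import numpy as np
from scipy.fft import dct, idct


def precondition(X, seed=0):
    """Hypothetical helper: random sign flip followed by a row-wise DCT.

    The sign flip is an assumption for illustration; the README only
    states that the preconditioning step is a discrete cosine transform.
    """
    rng = np.random.default_rng(seed)
    # One random sign per feature (column), shared by all rows.
    signs = rng.choice([-1.0, 1.0], size=X.shape[1])
    # Orthonormal DCT-II along the feature axis; preserves row norms.
    Y = dct(X * signs, type=2, norm="ortho", axis=1)
    return Y, signs


def undo_precondition(Y, signs):
    """Invert the hypothetical preconditioning above."""
    return idct(Y, type=2, norm="ortho", axis=1) * signs


# Small synthetic example: 5 samples with 16 features each.
rng = np.random.default_rng(1)
X = rng.normal(size=(5, 16))

Y, signs = precondition(X)
X_back = undo_precondition(Y, signs)
```

Because the transform is applied independently to each row, it can be run in batches or across workers and the results stored, so that later processing reads only the preconditioned data.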