sparsemf

A matrix factorization recommender that runs on top of NumPy and SciPy, developed with a focus on speed and highly sparse matrices.


Keywords
imputation, algorithm, matrix-factorization, recommender, soft-thresholding, sparse, sparse-matrix
License
Other
Install
pip install sparsemf==0.11

Documentation

sparseMF

SparseMF is a matrix factorization recommender written in Python that runs on top of NumPy and SciPy. It was developed with a focus on speed and on highly sparse matrices.

Use SparseMF if you need a recommender that:

  • Runs quickly on explicit recommender data, such as ratings
  • Supports scipy sparse matrix formats
  • Retains the sparsity of your data during training
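One way a solver can retain sparsity during training, used in the soft-impute literature, is to keep the working matrix as a sparse part plus a low-rank product instead of a dense array. The following is a minimal sketch of that idea with NumPy and SciPy. It is not SparseMF's actual code, and the function name `sparse_plus_low_rank_matvec` and all variable names are hypothetical.

```python
import numpy as np
from scipy import sparse


def sparse_plus_low_rank_matvec(S, U, d, V, v):
    """Compute (S + U @ diag(d) @ V.T) @ v without building the dense sum.

    S is a scipy sparse matrix; U, d, V hold a low-rank factorization.
    """
    # Sparse part costs O(nnz); low-rank part costs O((n_rows + n_cols) * rank).
    return S @ v + U @ (d * (V.T @ v))


rng = np.random.default_rng(0)
n_users, n_items, rank = 1000, 800, 5

# Hypothetical data: a 1%-dense sparse matrix and random low-rank factors.
S = sparse.random(n_users, n_items, density=0.01, format="csr", random_state=0)
U = rng.standard_normal((n_users, rank))
V = rng.standard_normal((n_items, rank))
d = rng.uniform(1.0, 2.0, size=rank)
v = rng.standard_normal(n_items)

fast = sparse_plus_low_rank_matvec(S, U, d, V, v)

# Reference result that materializes the full dense matrix (for checking only).
dense = (S.toarray() + (U * d) @ V.T) @ v
print(np.allclose(fast, dense))
```

The dense reference is there only to check the result. In practice the point of the structured product is that the full users-by-items array never has to be allocated.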

Algorithm

This repo provides sparse implementations of two matrix factorization algorithms. The algorithms were originally introduced by Trevor Hastie et al. in the 2014 paper "Matrix Completion and Low-Rank SVD via Fast Alternating Least Squares" as an extension to SoftImpute, which dates from 2009. Both implementations borrow from FancyImpute's dense Python implementation of the 2009 SoftImpute algorithm. On large, sparse matrices, the sparse version is significantly faster at predicting ratings for user/item pairs. To learn more about the differences between the two algorithms, read Trevor Hastie's vignette.
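To make the core idea concrete, here is a minimal dense sketch of the basic SoftImpute iteration: fill the missing entries with the current estimate, take an SVD, and soft-threshold the singular values. It assumes plain NumPy arrays and is only an illustration. It is neither SparseMF's sparse implementation nor the ALS variant, and the function name `soft_impute_dense` and its parameters are hypothetical.

```python
import numpy as np


def soft_impute_dense(X, mask, lam, max_iters=100, tol=1e-5):
    """Dense SoftImpute sketch.

    X    : 2-D float array; values where mask is False are ignored.
    mask : boolean array, True where an entry of X is observed.
    lam  : amount subtracted from each singular value (soft threshold).
    """
    Z = np.zeros_like(X, dtype=float)  # current low-rank estimate
    for _ in range(max_iters):
        # Keep observed entries, fill the rest from the current estimate.
        filled = np.where(mask, X, Z)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        # Soft-threshold: shrink singular values toward zero, clip at zero.
        s_shrunk = np.maximum(s - lam, 0.0)
        Z_new = (U * s_shrunk) @ Vt
        # Relative change in the estimate, guarded against division by zero.
        change = np.linalg.norm(Z_new - Z) / max(np.linalg.norm(Z), 1e-12)
        Z = Z_new
        if change < tol:
            break
    return Z


# Hypothetical test data: a rank-2 matrix with about 60% of entries observed.
rng = np.random.default_rng(0)
truth = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))
mask = rng.random((30, 20)) < 0.6

Z = soft_impute_dense(truth, mask, lam=0.5)
observed_rmse = np.sqrt(np.mean((Z - truth)[mask] ** 2))
baseline_rmse = np.sqrt(np.mean(truth[mask] ** 2))  # error of predicting all zeros
print(observed_rmse, baseline_rmse)
```

Each iteration here runs a full dense SVD, which is the step that becomes expensive on large matrices.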

Getting Started

SparseMF is simple to use. Choose the algorithm you would like to import, SoftImpute or SoftImputeALS, and use it as follows:

from softimpute import SoftImpute

model = SoftImpute()
X = my_data  # your explicit ratings data, e.g. a scipy sparse matrix
model.fit(X)
model.predict([users], [items])  # users and items to predict ratings for
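If your ratings are stored as (user, item, rating) triples, a scipy sparse matrix to use as `my_data` could be built roughly as follows. This is a sketch under assumptions: the sample ratings are made up, the variable names are hypothetical, and the choice of rows as users, columns as items, and CSR as the storage format is an assumption rather than something the package documents here.

```python
import numpy as np
from scipy import sparse

# Hypothetical explicit ratings as parallel arrays of indices and values.
user_ids = np.array([0, 0, 1, 2, 3, 3])
item_ids = np.array([1, 4, 0, 2, 3, 4])
ratings = np.array([5.0, 3.0, 4.0, 1.0, 2.0, 5.0])

n_users, n_items = 4, 5

# Build in COO format (natural for triples), then convert to CSR
# for efficient row slicing and matrix-vector products.
X = sparse.coo_matrix(
    (ratings, (user_ids, item_ids)), shape=(n_users, n_items)
).tocsr()

# Fraction of user/item pairs that actually have a rating.
density = X.nnz / (n_users * n_items)
print(X.shape, X.nnz, density)
```

Only the six stored ratings occupy memory. The unrated pairs are implicit, which is what makes this representation practical for highly sparse data.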

Relative Speed

Here is how the speed of SparseMF stacks up against GraphLab and FancyImpute:

Other Package Contents

In addition to these algorithms, the package also includes:

Resources

Here are some helpful resources:

  1. A Helpful Introduction to Matrix Factorization Recommenders.
  2. Benchmarks for MovieLens Dataset.
  3. Trevor Hastie's Hybrid Implementation of Soft-Impute and ALS.