A PyTorch implementation of Non-negative Matrix Factorization.
This package is published on PyPI:
pip install nmf-torch
Given a non-negative numeric matrix X of shape M-by-N (M is the number of samples, N the number of features), stored as either a numpy array or a torch tensor, run the following code:
from nmf import run_nmf
H, W, err = run_nmf(X, n_components=20)
This decomposes X into two new non-negative matrices:
- H of shape (M, 20), representing the transformed coordinates of the samples with respect to the 20 components;
- W of shape (20, N), representing the composition of each component in terms of features;
along with the loss between X and its approximation H * W.
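To make the shapes concrete, here is a minimal NumPy sketch. It does not call run_nmf: H and W are random stand-ins rather than fitted factors, and taking the loss as the plain Frobenius norm of X - H * W is an assumption, since the exact scaling of the value returned by run_nmf is not stated here.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, n_components = 100, 50, 20

# Random stand-ins with the shapes described above (not fitted factors).
H = rng.random((M, n_components))   # sample coordinates, shape (M, 20)
W = rng.random((n_components, N))   # component compositions, shape (20, N)

# Non-negative data that is close to rank 20, plus a little positive noise.
X = H @ W + 0.01 * rng.random((M, N))

X_hat = H @ W                               # approximation of X
frobenius_err = np.linalg.norm(X - X_hat)   # assumed loss: Frobenius norm

print(X_hat.shape, frobenius_err)
```

The inner dimension (20) is the only one shared by H and W, which is why n_components is the single size you choose.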
By default, the run_nmf function uses the batch HALS solver for NMF decomposition. In total, three solvers are available in NMF-torch:
- HALS: Hierarchical Alternating Least Squares ([Kimura et al., 2015]). The default.
- MU: Multiplicative Update. Set
- BPP: Alternating non-negative least squares with the Block Principal Pivoting method ([Kim & Park, 2011]). Set
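As an illustration of what one of these solvers does, the following is a textbook NumPy sketch of multiplicative updates for the Frobenius loss. It is not NMF-torch's implementation; the function name mu_nmf and its parameters (n_iter, eps, seed) are hypothetical and chosen only for this example.

```python
import numpy as np

def mu_nmf(X, n_components, n_iter=200, eps=1e-10, seed=0):
    """Textbook multiplicative updates for X ~ H @ W (Frobenius loss).

    Hypothetical helper for illustration; not part of NMF-torch.
    """
    rng = np.random.default_rng(seed)
    M, N = X.shape
    H = rng.random((M, n_components))
    W = rng.random((n_components, N))
    losses = []
    for _ in range(n_iter):
        # Each factor is rescaled element-wise, so it stays non-negative.
        H *= (X @ W.T) / (H @ (W @ W.T) + eps)
        W *= (H.T @ X) / ((H.T @ H) @ W + eps)
        losses.append(np.linalg.norm(X - H @ W))
    return H, W, losses

# Small exactly rank-5 non-negative matrix as test data.
rng = np.random.default_rng(1)
X = rng.random((60, 5)) @ rng.random((5, 40))
H, W, losses = mu_nmf(X, n_components=5)
print(losses[0], losses[-1])
```

Because the updates only multiply by non-negative ratios, no projection step is needed, and the loss does not increase from one iteration to the next.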
In addition, each solver has two modes: batch and online.
The online mode is a modified version that scales to input matrices with a large number of samples.
You can set
run_nmf function to switch to use the online mode.
The default beta loss is the Frobenius (L2) distance, which is the most commonly used.
By setting the beta_loss parameter in run_nmf, users can specify other beta loss metrics:
- Or any non-negative float number.
Note that the online mode only works with L2 loss, so if you specify any other beta loss, run_nmf will automatically switch back to batch mode.
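For readers unfamiliar with the beta-divergence family, the sketch below evaluates the standard definition for an arbitrary non-negative beta. It is an illustration only: the helper name beta_divergence is hypothetical, and whether NMF-torch applies the same scaling (for example the factor 1/2 at beta = 2) is an assumption.

```python
import numpy as np

def beta_divergence(X, Y, beta):
    """Sum of element-wise beta divergences d_beta(X | Y), positive inputs.

    Hypothetical helper following the standard textbook definition.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if beta == 1:   # limit case: Kullback-Leibler divergence
        return float(np.sum(X * np.log(X / Y) - X + Y))
    if beta == 0:   # limit case: Itakura-Saito divergence
        ratio = X / Y
        return float(np.sum(ratio - np.log(ratio) - 1))
    num = X**beta + (beta - 1) * Y**beta - beta * X * Y**(beta - 1)
    return float(np.sum(num) / (beta * (beta - 1)))

X = np.array([[1.0, 2.0], [3.0, 4.0]])
Y = np.array([[1.5, 2.0], [2.0, 4.5]])

l2 = beta_divergence(X, Y, 2)      # equals 0.5 * squared Frobenius distance
kl = beta_divergence(X, Y, 1)
half = beta_divergence(X, Y, 0.5)  # any other non-negative float also works
print(l2, kl, half)
```

With beta = 2 the expression reduces to half the squared Frobenius distance, which is why that value corresponds to the L2 loss.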
For the other parameters of the run_nmf function, type help(run_nmf) in your Python interpreter.
Data Integration using Integrative NMF (iNMF)
In this case, we have a list of k batches, with their corresponding non-negative numeric matrices to be integrated.
Let X be such a list, where all matrices in X have the same number of features, i.e. each Xi in X has shape (Mi, N), where Mi is the number of samples in batch i and N is the number of features.
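Before integration, it can help to confirm that the list really has this layout. The following is a small sketch with a hypothetical helper, check_batches, which is not part of NMF-torch; it only verifies the two requirements stated above (non-negativity and a shared number of features).

```python
import numpy as np

def check_batches(X):
    """Return (k, N) after checking that X is a valid list of batches.

    Hypothetical helper for illustration; not part of NMF-torch.
    """
    if len(X) == 0:
        raise ValueError("X must contain at least one batch")
    n_features = None
    for i, Xi in enumerate(X):
        if Xi.ndim != 2:
            raise ValueError(f"batch {i} is not a 2-D matrix")
        if n_features is None:
            n_features = Xi.shape[1]
        elif Xi.shape[1] != n_features:
            raise ValueError(f"batch {i} has {Xi.shape[1]} features, "
                             f"expected {n_features}")
        if (Xi < 0).any():
            raise ValueError(f"batch {i} contains negative values")
    return len(X), n_features

# Three batches with different sample counts Mi but the same N = 30.
rng = np.random.default_rng(0)
X = [rng.random((m, 30)) for m in (80, 120, 50)]
k, N = check_batches(X)
print(k, N)
```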
The following code:
from nmf import integrative_nmf
H, W, V, err = integrative_nmf(X, n_components=20)
will perform iNMF, which results in the following non-negative matrices:
- H: List of matrices of shape (Mi, 20), each of which represents the transformed coordinates of the samples of the corresponding batch with respect to the components;
- W of shape (20, N), representing the common composition (shared information) across the given batches in terms of features;
- V: List of matrices of the same shape (20, N), each of which represents the batch-specific composition of the corresponding batch in terms of features;
along with the overall L2 loss between Xi and its approximation Hi * (W + Vi) for each batch i.
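The per-batch approximation Hi * (W + Vi) can be sketched in NumPy as follows. The factors here are random stand-ins, not the output of integrative_nmf, and aggregating the loss as a sum of squared Frobenius norms over batches is an assumption; the helper name inmf_reconstruction_loss is hypothetical.

```python
import numpy as np

def inmf_reconstruction_loss(X, H, W, V):
    """Sum over batches of the squared Frobenius norm of Xi - Hi @ (W + Vi).

    Hypothetical helper; the aggregation used by NMF-torch may differ.
    """
    return float(sum(np.linalg.norm(Xi - Hi @ (W + Vi)) ** 2
                     for Xi, Hi, Vi in zip(X, H, V)))

rng = np.random.default_rng(0)
n_components, N = 20, 30
sizes = (80, 120, 50)                       # Mi for each of the 3 batches

W = rng.random((n_components, N))           # shared composition
V = [0.1 * rng.random((n_components, N)) for _ in sizes]  # batch-specific
H = [rng.random((m, n_components)) for m in sizes]        # sample coordinates

# Data generated exactly by the model, then a uniformly shifted copy.
X = [Hi @ (W + Vi) for Hi, Vi in zip(H, V)]
X_noisy = [Xi + 0.05 for Xi in X]

exact_loss = inmf_reconstruction_loss(X, H, W, V)
noisy_loss = inmf_reconstruction_loss(X_noisy, H, W, V)
print(exact_loss, noisy_loss)
```

Each batch shares W but adds its own Vi, so the batch-specific signal is kept separate from the shared composition.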
As with the run_nmf function above, integrative_nmf provides two modes (batch and online) and three solvers: HALS, MU, and BPP.
By default, batch HALS is used. You can switch to other solvers and modes by specifying
Another important parameter is lam, the coefficient of the regularization terms, with default value
If set to 0, no regularization is applied.
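To show how such a coefficient enters the objective, the sketch below uses the commonly cited iNMF formulation, in which the penalty is the squared Frobenius norm of Hi * Vi summed over batches. That NMF-torch uses exactly this penalty and scaling is an assumption, and inmf_objective is a hypothetical helper, not a library function.

```python
import numpy as np

def inmf_objective(X, H, W, V, lam):
    """Reconstruction error plus lam times a penalty on batch-specific parts.

    Assumed form: sum_i ||Xi - Hi (W + Vi)||^2 + lam * sum_i ||Hi Vi||^2.
    """
    recon = sum(np.linalg.norm(Xi - Hi @ (W + Vi)) ** 2
                for Xi, Hi, Vi in zip(X, H, V))
    penalty = sum(np.linalg.norm(Hi @ Vi) ** 2 for Hi, Vi in zip(H, V))
    return float(recon + lam * penalty)

rng = np.random.default_rng(0)
n_components, N = 4, 10
sizes = (15, 25)

X = [rng.random((m, N)) for m in sizes]
H = [rng.random((m, n_components)) for m in sizes]
W = rng.random((n_components, N))
V = [0.1 * rng.random((n_components, N)) for _ in sizes]

obj0 = inmf_objective(X, H, W, V, lam=0.0)   # no regularization
obj1 = inmf_objective(X, H, W, V, lam=1.0)
obj2 = inmf_objective(X, H, W, V, lam=2.0)
print(obj0, obj1, obj2)
```

With lam = 0 only the reconstruction term remains; larger values push the batch-specific matrices Vi toward zero, so more of the signal is explained by the shared W.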
Note that iNMF accepts only L2 loss.
For the other parameters of the integrative_nmf function, type help(integrative_nmf) in your Python interpreter.