t-SNE.js
t-distributed stochastic neighbor embedding (t-SNE) algorithm implemented in JavaScript
Runs in the browser (also runs in Web Workers)
Runs in node.js
Uses efficient in-place matrix operations via ndarray
Closely follows the API of scikit-learn, allowing specification of perplexity and early exaggeration factor, among other parameters.
Background
t-SNE is a powerful manifold learning technique for embedding data into a low-dimensional space (typically 2D or 3D for visualization purposes) while preserving small pairwise distances, i.e. local data structure, in the original high-dimensional space. In practice, this often results in a much more intuitive layout within the low-dimensional space than other techniques produce. The low-dimensional embedding is learned by minimizing the Kullback-Leibler divergence between the pairwise-similarity probability distribution over the original data space and the corresponding distribution over the embedding space.
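To make the objective concrete, here is a minimal sketch of the Kullback-Leibler divergence for two discrete distributions stored as flat arrays. The function name `klDivergence` and the `eps` guard are illustrative assumptions and not part of this library's API; in t-SNE the two arrays would be the flattened pairwise-similarity matrices over the data space and the embedding space.

```javascript
// Sketch only: KL(P || Q) = sum_i p_i * log(p_i / q_i).
// `klDivergence` and `eps` are hypothetical names, not library API.
function klDivergence(p, q, eps = 1e-12) {
  if (p.length !== q.length) {
    throw new Error('p and q must have the same length');
  }
  let sum = 0;
  for (let i = 0; i < p.length; i++) {
    // Terms with p[i] === 0 contribute nothing to the sum.
    if (p[i] > 0) {
      // eps guards against division by zero when q[i] is 0.
      sum += p[i] * Math.log(p[i] / Math.max(q[i], eps));
    }
  }
  return sum;
}
```

The divergence is zero when the two distributions are identical and grows as the embedding-space similarities drift away from the data-space similarities.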
An important note is that the objective function is non-convex with numerous local minima, and thus the results are non-deterministic. A few model parameters influence the learning and optimization process. Selecting appropriate parameters for the input data can significantly improve the chances that the model converges on a good solution.
Currently, only the exact formulation is implemented, which has computational complexity O(dN^2), where d is the original dimensionality of the data and N is the number of samples. Implementation of the O(dN log N) Barnes-Hut approximation variant is planned (contributions welcome!).
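The O(dN^2) cost comes from computing a distance for every pair of samples. The following sketch shows this with plain nested arrays; it is an illustration of the exact pairwise step under that assumption, not the library's ndarray-based implementation, and `pairwiseSquaredDistances` is a hypothetical name.

```javascript
// Sketch only: squared Euclidean distance between every pair of rows.
// N*(N-1)/2 pairs, each costing O(d), gives O(dN^2) overall.
function pairwiseSquaredDistances(data) {
  const n = data.length;
  const dist = Array.from({ length: n }, () => new Array(n).fill(0));
  for (let i = 0; i < n; i++) {
    for (let j = i + 1; j < n; j++) {
      let sum = 0;
      for (let k = 0; k < data[i].length; k++) {
        const diff = data[i][k] - data[j][k];
        sum += diff * diff;
      }
      // The matrix is symmetric, so each pair is computed once.
      dist[i][j] = sum;
      dist[j][i] = sum;
    }
  }
  return dist;
}
```

Doubling the number of samples roughly quadruples the work in this step, which is why the exact formulation becomes slow for large N.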
Usage
Can be run in node.js or the browser. In the browser, it should ideally be run in a Web Worker so that the computation does not block the UI thread.
node.js
$ npm install tsne-js --save
import TSNE from 'tsne-js';
let model = new TSNE({
dim: 2,
perplexity: 30.0,
earlyExaggeration: 4.0,
learningRate: 100.0,
nIter: 1000,
metric: 'euclidean'
});
// inputData is a nested array which can be converted into an ndarray
// alternatively, it can be an array of coordinates (`type` should then be specified as 'sparse')
model.init({
data: inputData,
type: 'dense'
});
// `error`, `iter`: final error and iteration number
// note: computation-heavy action happens here
let [error, iter] = model.run();
// rerun without recalculating pairwise distances, etc.
[error, iter] = model.rerun();
// `output` is unpacked ndarray (regular nested javascript array)
let output = model.getOutput();
// `outputScaled` is `output` scaled to a range of [-1, 1]
let outputScaled = model.getOutputScaled();
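Because results differ between runs, one option is to rerun the model several times and keep the embedding with the lowest final error. The sketch below uses only rerun() and getOutput() as shown above; the helper name `bestOfRuns` is hypothetical, and it assumes that the model has already been initialized and run once, and that each rerun() starts from a fresh random initialization.

```javascript
// Sketch only: keep the lowest-error embedding across several reruns.
// `model` is any object exposing rerun() and getOutput() as in the usage above.
function bestOfRuns(model, nRuns) {
  let best = { error: Infinity, iter: 0, output: null };
  for (let r = 0; r < nRuns; r++) {
    const [error, iter] = model.rerun();
    if (error < best.error) {
      // Capture the output now, since the next rerun replaces it.
      best = { error, iter, output: model.getOutput() };
    }
  }
  return best;
}
```

For example, `bestOfRuns(model, 5).output` would hold the embedding from whichever of the five reruns reported the smallest error.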
browser
<script src="tsne.min.js"></script>
Then it's the same API as above. A browser example using Web Workers is in the example/ folder.
Model Parameters
dim: number of embedding dimensions, typically 2 or 3
perplexity: approximately related to the number of nearest neighbors used during learning, typically between 5 and 50
earlyExaggeration: parameter which influences spacing between clusters, must be at least 1.0
learningRate: learning rate for gradient descent, typically between 100 and 1000
nIter: maximum number of iterations, should be at least 200
metric: distance measure to use for input data; currently implemented measures include
  'euclidean'
  'manhattan'
  'jaccard' (boolean data)
  'dice' (boolean data)
You can also pass a distance function to metric:
import cwise from 'cwise';
// Operates on an n-dimensional array using the cwise module
let euclidean = cwise({
  args: ['array', 'array'],
  pre: function(a, b) {
    this.sum = 0.0;
  },
  body: function(a, b) {
    var d = a - b;
    this.sum += d * d;
  },
  post: function(a, b) {
    return Math.sqrt(this.sum);
  }
});
let model = new TSNE({
  metric: euclidean
});
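For reference, the non-Euclidean measures named above are commonly defined as in the sketch below. It works on plain arrays and follows the standard textbook definitions; it is not the library's cwise-based implementation, the function names are hypothetical, and nothing here confirms that plain-array functions like these can be passed as metric directly.

```javascript
// Sketch only: standard definitions on plain arrays, not library code.

// Manhattan distance: sum of absolute coordinate differences.
function manhattanDistance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += Math.abs(a[i] - b[i]);
  }
  return sum;
}

// Count positions where both entries are true and where they differ.
function countBoolean(a, b) {
  let both = 0;
  let differ = 0;
  for (let i = 0; i < a.length; i++) {
    const x = Boolean(a[i]);
    const y = Boolean(b[i]);
    if (x && y) both++;
    else if (x !== y) differ++;
  }
  return { both, differ };
}

// Jaccard distance: differing positions / positions where either is true.
function jaccardDistance(a, b) {
  const { both, differ } = countBoolean(a, b);
  const denom = both + differ;
  return denom === 0 ? 0 : differ / denom;
}

// Dice distance: like Jaccard, but shared true positions count twice.
function diceDistance(a, b) {
  const { both, differ } = countBoolean(a, b);
  const denom = 2 * both + differ;
  return denom === 0 ? 0 : differ / denom;
}
```

Both boolean measures ignore positions where the two vectors are both false, which is why they suit sparse presence/absence data.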
Build
To run the build yourself, for both the browser (outputs to build/tsne.min.js) and node.js (outputs to dist/):
$ npm run build
To build for just the browser, run npm run build-browser, and to build for just node.js, run npm run build-node.
Tests
$ npm test
References
The original paper on t-SNE:
L.J.P. van der Maaten and G.E. Hinton.
Visualizing High-Dimensional Data Using t-SNE.
Journal of Machine Learning Research 9(Nov):2579-2605, 2008.
Paper on the Barnes-Hut variant of t-SNE:
L.J.P. van der Maaten.
Accelerating t-SNE using Tree-Based Algorithms.
Journal of Machine Learning Research 15(Oct):3221-3245, 2014.