mst_clustering

Clustering with Minimum Spanning Trees


License: BSD-3-Clause
Install: pip install mst_clustering==1.0

Minimum Spanning Tree Clustering

This package implements a simple scikit-learn style estimator for clustering with a minimum spanning tree.

Motivation

Automated clustering can be an important means of identifying structure in data, but many of the more popular clustering algorithms do not perform well in the presence of background noise. The clustering algorithm implemented here, based on a trimmed Euclidean Minimum Spanning Tree, can be useful in this case.

Example

The API of the mst_clustering code is designed for compatibility with the scikit-learn project.

from mst_clustering import MSTClustering
from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt

# create some data with four clusters
X, y = make_blobs(200, centers=4, random_state=42)

# predict the labels with the MST algorithm
model = MSTClustering(cutoff_scale=2)
labels = model.fit_predict(X)

# plot the results
plt.scatter(X[:, 0], X[:, 1], c=labels, cmap='rainbow')
plt.show()

Figure: Simple Clustering Plot

For a detailed explanation of the algorithm and a more interesting example of it in action, see the MST Clustering Notebook.
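
The idea behind a trimmed minimum spanning tree can be sketched in a few lines of pure Python. The block below is only an illustration under stated assumptions, not the package's implementation: the helper names mst_edges and trimmed_mst_labels are hypothetical, the quadratic-time Prim's algorithm is chosen for brevity, and the min_cluster_size argument here is an assumed way of flagging small groups as noise with the label -1. The cutoff_scale name is borrowed from the example above and is taken to mean "remove tree edges longer than this".

```python
import math


def mst_edges(points):
    """Return the edges (i, j, length) of a Euclidean MST via Prim's algorithm."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n  # shortest known distance from each point to the tree
    best_from = [-1] * n        # tree vertex that realizes that distance
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # pick the closest point that is not yet part of the tree
        u = min((i for i in range(n) if not in_tree[i]),
                key=best_dist.__getitem__)
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u, best_dist[u]))
        # refresh the distances of the points still outside the tree
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges


def trimmed_mst_labels(points, cutoff_scale, min_cluster_size=1):
    """Cut MST edges longer than cutoff_scale and label the remaining pieces.

    Groups with fewer than min_cluster_size points get the label -1
    (an assumption made for this sketch).
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # union-find lookup with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # keep only the short edges; each kept edge merges two groups
    for i, j, length in mst_edges(points):
        if length <= cutoff_scale:
            parent[find(i)] = find(j)

    roots = [find(i) for i in range(n)]
    sizes = {}
    for r in roots:
        sizes[r] = sizes.get(r, 0) + 1

    label_of = {}
    labels = []
    for r in roots:
        if sizes[r] < min_cluster_size:
            labels.append(-1)
        else:
            labels.append(label_of.setdefault(r, len(label_of)))
    return labels


# two tight groups plus one isolated background point
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0),
          (50.0, 50.0)]
print(trimmed_mst_labels(points, cutoff_scale=2.0, min_cluster_size=2))
# → [0, 0, 0, 1, 1, -1]
```

The isolated point is connected to the rest of the tree only by a long edge, so trimming that edge leaves it in a group of its own, which is how a background point ends up separated from the dense clusters.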

Installation & Requirements

The mst_clustering package itself is fairly lightweight. It is tested on Python 2.7 and 3.4-3.5, and depends on the following packages: numpy, scipy, and scikit-learn.

Using the cross-platform conda package manager, these requirements can be installed as follows:

$ conda install numpy scipy scikit-learn

Finally, the current release of mst_clustering can be installed using pip:

$ conda install pip  # if using conda
$ pip install mst_clustering

To install mst_clustering from source, first download the source repository and then run

$ python setup.py install
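
After either installation route, a quick check that the package and its dependencies can be found on the import path might look like the following. This is a small sketch using only the standard library; the helper name missing_modules is hypothetical, and sklearn is the import name of scikit-learn.

```python
import importlib.util


def missing_modules(names):
    """Return the names in `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


required = ["numpy", "scipy", "sklearn", "mst_clustering"]
missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```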

Contributing & Reporting Issues

Bug reports, questions, suggestions, and contributions are welcome. For these, please make use of the Issues or Pull Requests associated with this repository.

Citing

If you use this code in an academic publication, please consider citing this JOSS Paper.