pspartition

a hierachical clustering algorithm based on information theory


License
Apache-2.0
Install
pip install pspartition==0.7.post5

Documentation

Windows Build Status codecov Documentation Status

Introduction

This repository contains source code to compute the principal sequence of partition for (un)directed graph. Currently, four methods are available:

  1. Traditional Canonical Method based on Dilworth truncation and Graph Maximal Flow (mac), dt in our library;
  2. Parametric computing scheme combined with Parametric Maximal Flow (pdt, our algorithm)
  3. Graph Contraction combined with rapid jump (psp_i, our algorithm, fastest among all)
  4. original parametric Dilworth truncation. (pin, called pdt_r in our library)

All method relies on LEMON Library to compute maximum flow for graph.

How to build

Dependencies

  • LEMON (required)
  • boost-program-options is required to build the executable program main
  • googletest (optional, used in unit-test)

You need to enable C++11 to compile the code.

Dependency Installation Guide

  • If you use CentOS/Fedora based system, you can install the pre-built binary package liblemon from copr.
  • If you use Ubuntu(<18.04), you can install the pre-built binary package liblemon from launchpad; liblemon library is included in Ubuntu from 18.04, see liblemon-dev.
  • On Windows platform, vcpkg is required to fix the dependencies.
  • On MacOS platform, brew is recommended to fix the dependencies.

CMake

This project uses CMake build system. We recommend out of source build. From project root directory,

mkdir build
cd build
cmake ..

If any error occurs, you should fix the dependencies first.

options

use -DEnable_Testing=ON to compile the test binary (requiring googletest).

Reference

  1. [2016] Info-Clustering: A Mathematical Theory for Data Clustering
  2. https://github.com/ktrmnm/SFM
  3. [2010] A Faster Algorithm for Computing the Principal Sequence of Partitions of a Graph
  4. [2017] Info-Clustering: An Efficient Algorithm by Network Information Flow

Contributing

See contributing.md

ChangeLog

  • Version 1.2: implement DT Class.
  • Version 1.3: implement PDT Class
  • Version 1.4: implement PSP_I Class.
  • Version 1.5: implement PDT_R Class.
  • Version 1.6: create uniform wrapper PSP Class of the above four methods