psankey

Package for plotting Sankey diagrams with Python


License
MIT
Install
pip install psankey==1.0.1

Documentation

psankey - A module for plotting Sankey flow diagrams in Python

Inspired by d3-sankey package for d3js (https://github.com/d3/d3-sankey)

Brief description

In data science, we often need to visualize flows as a Sankey diagram. This module does that with a single function call, and the appearance of nodes and links is customizable.

Note: Cyclic graphs are not supported; the links must not form a loop.
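Since cyclic input is not supported, it can help to check the links before plotting. The following is a minimal sketch of such a check using Kahn's algorithm; `has_cycle` is a hypothetical helper written for this illustration and is not part of psankey.

```python
import pandas as pd


def has_cycle(links: pd.DataFrame) -> bool:
    """Return True if the source -> target links contain a directed cycle.

    Hypothetical helper (not part of psankey), based on Kahn's algorithm.
    """
    nodes = set(links["source"]) | set(links["target"])
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for src, dst in zip(links["source"], links["target"]):
        children[src].append(dst)
        indegree[dst] += 1

    # Repeatedly remove nodes that have no remaining incoming links.
    ready = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)

    # Any node that was never removed lies on, or downstream of, a cycle.
    return removed < len(nodes)


acyclic = pd.DataFrame({"source": ["B", "E"], "target": ["E", "A"], "value": [20, 20]})
cyclic = pd.DataFrame({"source": ["A", "B"], "target": ["B", "A"], "value": [1, 1]})
print(has_cycle(acyclic), has_cycle(cyclic))
```

A DataFrame for which the helper returns True should be restructured before it is passed to the plotting function.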

Getting started

Installation

Directly from source: clone this repository to your local machine, open a shell, navigate to the repository directory and run:

python setup.py install

or through pip:

pip install psankey

Documentation

Input Data Format

A dataframe of links with the following columns (first 3 required, rest optional):

source: name of source node
target: name of target node
value: value (width or breadth) of the link
color: optional. color of the link
alpha: optional. alpha (opacity) of the link
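The column layout above can also be built and checked directly in pandas. This is only a sketch: `check_links` is a hypothetical helper, not a psankey function, and the requirement that value be numeric is an assumption based on its description as a link width.

```python
import pandas as pd

REQUIRED = ["source", "target", "value"]


def check_links(links: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: verify the column layout described above."""
    missing = [c for c in REQUIRED if c not in links.columns]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    # Assumption: link widths must be numbers.
    if not pd.api.types.is_numeric_dtype(links["value"]):
        raise TypeError("'value' must be numeric")
    return links


links = check_links(pd.DataFrame({
    "source": ["B", "C", "D"],
    "target": ["E", "E", "A"],
    "value": [20, 20, 40],
    "color": [None, None, "orange"],  # optional column
    "alpha": [None, None, 0.85],      # optional column
}))
print(links)
```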

Example:

Input:

data1.csv

source,target,value,color,alpha
B,E,20,,
C,E,20,,
C,D,20,,
B,A,20,,
E,D,20,,
E,A,20,,
D,A,40,orange,0.85
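As a quick sanity check on this input, the sketch below loads the same rows with pandas and totals the value entering and leaving each node. It uses only pandas and is independent of psankey; the blank color and alpha fields are read as missing values (NaN).

```python
import io

import pandas as pd

csv_text = """source,target,value,color,alpha
B,E,20,,
C,E,20,,
C,D,20,,
B,A,20,,
E,D,20,,
E,A,20,,
D,A,40,orange,0.85
"""

links = pd.read_csv(io.StringIO(csv_text))

# Total value leaving and entering each node; nodes without links on one
# side (pure sources or sinks) get 0 on that side.
outflow = links.groupby("source")["value"].sum()
inflow = links.groupby("target")["value"].sum()
totals = pd.DataFrame({"inflow": inflow, "outflow": outflow}).fillna(0)
print(totals)
```

In this data set the intermediate nodes D and E are balanced (40 in, 40 out), B and C are pure sources, and A is a pure sink.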

Output:

Usage

from psankey.sankey import sankey
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('data/data1.csv')
mod = {'D': dict(facecolor='green', edgecolor='black', alpha=1, label='D1', yPush=1)}
nodes, fig, ax = sankey(
    df, aspect_ratio=4/3, nodelabels=True, linklabels=True, labelsize=5,
    nodecmap='copper', nodecolorby='level', nodealpha=0.5,
    nodeedgecolor='white', nodemodifier=mod,
)
plt.show()

Parameters

df: pandas dataframe. DataFrame with the links. Required columns: source, target, value. Optional columns: color, alpha.

aspect_ratio: float, default: 4/3. Aspect ratio of the figure.

nodelabels: boolean, default: True. Whether node labels should be plotted.

linklabels: boolean, default: True. Whether link labels should be plotted.

labelsize: int, default: 5. Font size of the labels.

nodecolorby: default: 'level'. Possible values: 'level', 'size', 'index', a dictionary mapping each node to a value, or any color (e.g. 'blue').

nodecmap: default: None. Colormap of the nodes, required if nodecolorby is 'level', 'size' or 'index'. To learn more: https://matplotlib.org/3.2.1/tutorials/colors/colormaps.html

nodealpha: float, default: 0.5. Alpha of the nodes, between 0 (fully transparent) and 1 (fully opaque).

nodeedgecolor: default: 'white'. Color of the border of the nodes.

nodemodifier: optional. Use this to format a few nodes selectively, overriding the nodecmap, nodecolorby, nodealpha and nodeedgecolor parameters. The overrides are passed as a dictionary keyed by node name, as shown in the Usage example.
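To illustrate the dictionary form of nodecolorby together with nodemodifier, here is a hedged sketch that prepares both arguments from the example links. The choice of coloring value (total value of all links touching a node) is arbitrary and made up for this illustration, and passing nodecmap alongside a dictionary is an assumption. The call to sankey is left commented out so the snippet runs without psankey installed.

```python
import pandas as pd

links = pd.DataFrame({
    "source": ["B", "C", "C", "B", "E", "E", "D"],
    "target": ["E", "E", "D", "A", "D", "A", "A"],
    "value": [20, 20, 20, 20, 20, 20, 40],
})

# One number per node: the summed value of every link touching that node.
throughput = {}
for row in links.itertuples(index=False):
    throughput[row.source] = throughput.get(row.source, 0) + row.value
    throughput[row.target] = throughput.get(row.target, 0) + row.value

# Selective override for a single node, using keys from the Usage example.
modifier = {"A": dict(facecolor="green", edgecolor="black", alpha=1, label="A (sink)")}

plot_kwargs = dict(
    nodecolorby=throughput,  # dictionary mapping each node to a value
    nodecmap="copper",       # assumption: a colormap is still needed here
    nodemodifier=modifier,
)

# With psankey installed, the call would presumably look like this:
# from psankey.sankey import sankey
# nodes, fig, ax = sankey(links, **plot_kwargs)
print(throughput)
```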

Sample Output

Citing pSankey

If you use this library in scientific publications (or anywhere else, if you wish), please cite it using the link to the GitHub repository (https://github.com/mandalsubhajit/pSankey). Thank you!