psankey - A module for plotting Sankey flow diagrams in Python
Inspired by the d3-sankey package for d3.js (https://github.com/d3/d3-sankey)
Brief description
In data science, we often need to visualize flows as a Sankey diagram. This module helps with that. Usage is straightforward and customizable.
Note: Cyclic graphs are not supported.
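Since cyclic graphs are not supported, it can help to check the links before plotting. The following is a minimal sketch using only the standard library; has_cycle is a hypothetical helper and not part of psankey, and the link list simply mirrors the source/target pairs of the example data further down.

```python
from collections import defaultdict, deque


def has_cycle(links):
    """Return True if the directed graph given by (source, target) pairs has a cycle."""
    successors = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for source, target in links:
        successors[source].append(target)
        indegree[target] += 1
        nodes.update((source, target))

    # Kahn's algorithm: repeatedly remove nodes that have no incoming links.
    queue = deque(n for n in nodes if indegree[n] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)

    # Nodes that could never be removed lie on, or downstream of, a cycle.
    return removed < len(nodes)


links = [("B", "E"), ("C", "E"), ("C", "D"), ("B", "A"),
         ("E", "D"), ("E", "A"), ("D", "A")]
print(has_cycle(links))                 # False
print(has_cycle(links + [("A", "B")]))  # True
```

With a dataframe, the pairs could be obtained with something like zip(df['source'], df['target']).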
Getting started
Installation
Directly from source: clone this repo to your local machine, open a shell, navigate to the repo directory and run:
python setup.py install
or through pip:
pip install psankey
Documentation
Input Data Format
A dataframe of links with the following columns (first 3 required, rest optional):
source: name of source node
target: name of target node
value: value (width or breadth) of the link
color: optional. color of the link
alpha: optional. alpha (opacity) of the link
Example:
Input:
data1.csv
source,target,value,color,alpha
B,E,20,,
C,E,20,,
C,D,20,,
B,A,20,,
E,D,20,,
E,A,20,,
D,A,40,orange,0.85
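To make the format concrete, here is a small sketch that parses the same CSV text with the standard library and checks that the three required columns are present. read_links is a hypothetical helper, not part of psankey. Mapping empty optional cells to None is a choice made for this sketch; pandas reads empty cells as NaN, and how psankey treats them is not shown here.

```python
import csv
import io

REQUIRED = ("source", "target", "value")

CSV_TEXT = """source,target,value,color,alpha
B,E,20,,
C,E,20,,
C,D,20,,
B,A,20,,
E,D,20,,
E,A,20,,
D,A,40,orange,0.85
"""


def read_links(text):
    """Parse link rows from CSV text and verify the required columns exist."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required column(s): {', '.join(missing)}")
    links = []
    for row in reader:
        links.append({
            "source": row["source"],
            "target": row["target"],
            "value": float(row["value"]),
            # Optional columns: an empty or absent cell becomes None (sketch choice).
            "color": row.get("color") or None,
            "alpha": float(row["alpha"]) if row.get("alpha") else None,
        })
    return links


links = read_links(CSV_TEXT)
print(len(links))  # 7
print(links[-1])
```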
Usage
from psankey.sankey import sankey
import pandas as pd
import matplotlib.pyplot as plt
# Load the links; the file needs source, target and value columns
df = pd.read_csv('data/data1.csv')
# Per-node overrides: format node 'D' individually
mod = {'D': dict(facecolor='green', edgecolor='black', alpha=1, label='D1', yPush=1)}
nodes, fig, ax = sankey(
    df, aspect_ratio=4/3, nodelabels=True, linklabels=True, labelsize=5,
    nodecmap='copper', nodecolorby='level', nodealpha=0.5,
    nodeedgecolor='white', nodemodifier=mod,
)
plt.show()
Parameters
df: pandas DataFrame. The links to plot. Required columns: source, target, value. Optional columns: color, alpha.
aspect_ratio: float, default: 4/3. Aspect ratio of the figure.
nodelabels: boolean, default: True. Whether node labels should be plotted.
linklabels: boolean, default: True. Whether link labels should be plotted.
labelsize: int, default: 5. Font size of the labels.
nodecolorby: default: 'level'. Possible values: 'level', 'size', 'index', a dictionary mapping each node to a value, or any color, e.g. 'blue'.
nodecmap: default: None. Colormap of the nodes, required if nodecolorby is 'level', 'size' or 'index'. To learn more: https://matplotlib.org/3.2.1/tutorials/colors/colormaps.html
nodealpha: float, default: 0.5. Alpha of the nodes, between 0 (fully transparent) and 1 (fully opaque).
nodeedgecolor: default: 'white'. Color of the border of the nodes.
nodemodifier: optional. Use this to format a few nodes selectively, overriding the nodecmap, nodecolorby, nodealpha and nodeedgecolor parameters. Pass the overrides as a dictionary, as shown in the usage example.
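As an illustration of the dictionary form of nodecolorby, the sketch below builds a mapping from each node to a number, here the larger of its total inflow and total outflow. node_throughput is a hypothetical helper, not part of psankey, and using throughput as the coloring value is only one possible choice. Whether nodecmap must also be given alongside a dictionary is an assumption in the commented-out call.

```python
from collections import defaultdict


def node_throughput(links):
    """Map each node to max(total inflow, total outflow) over (source, target, value) links."""
    inflow = defaultdict(float)
    outflow = defaultdict(float)
    for source, target, value in links:
        outflow[source] += value
        inflow[target] += value
    nodes = set(inflow) | set(outflow)
    return {n: max(inflow[n], outflow[n]) for n in sorted(nodes)}


links = [("B", "E", 20), ("C", "E", 20), ("C", "D", 20), ("B", "A", 20),
         ("E", "D", 20), ("E", "A", 20), ("D", "A", 40)]
node_values = node_throughput(links)
print(node_values)  # {'A': 80.0, 'B': 40.0, 'C': 40.0, 'D': 40.0, 'E': 40.0}

# Hypothetical use, not run here (assumes a colormap is still wanted with a dictionary):
# nodes, fig, ax = sankey(df, nodecolorby=node_values, nodecmap='copper')
```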
Sample Output
Citing pSankey
To cite the library if you use it in scientific publications (or anywhere else, if you wish), please use the link to the GitHub repository (https://github.com/mandalsubhajit/pSankey). Thank you!