SELFISH (Discovery of Differential Chromatin Interactions via a Self-Similarity Measure) is a tool by Abbas Roayaei Ardakany, Ferhat Ay (firstname.lastname@example.org), and Stefano Lonardi. This Python implementation is currently maintained by Tuvan Gezer (email@example.com). The original MATLAB implementation by Abbas (firstname.lastname@example.org) can be found at https://github.com/ucrbioinfo/SELFISH.
SELFISH is a tool for finding differential chromatin interactions between two Hi-C contact maps. It uses self-similarity to model interactions in a robust way. For more information read the full paper: Selfish: discovery of differential chromatin interactions via a self-similarity measure.
Installation and usage
pip3 install selfish-hic
selfish -f1 /path/to/contact/map1.txt \
        -f2 /path/to/contact/map2.txt \
        -ch 2 \
        -r 100kb -o ./output.npy
Make sure you have Python 3 installed, along with all the dependencies listed.
git clone https://github.com/ay-lab/selfish
./selfish/selfish/selfish.py -f1 /path/to/contact/map1.txt \
                             -f2 /path/to/contact/map2.txt \
                             -ch 2 \
                             -r 100kb -o ./output.npy
If you run into dependency or version-mismatch problems, we recommend using Nextflow with a container technology such as Docker or Singularity. These methods require Nextflow (which can be installed with a single command that doesn't require special permissions) and the desired container technology to be available.
Program arguments are passed to Nextflow with two dashes followed by the short form listed below (for example, --f1 instead of -f1).
Updating: If Nextflow warns that your project is outdated, run the following command to update to the latest version:
nextflow pull ay-lab/selfish
Nextflow works on Linux and OS X and requires Java 8+. Install it using one of the commands listed below.
wget -qO- https://get.nextflow.io | bash
OR
curl -s https://get.nextflow.io | bash
./nextflow run ay-lab/selfish --f1="/path/to/contact/map1.txt" \
                              --f2="/path/to/contact/map2.txt" \
                              --ch=2 \
                              --r=100kb -profile docker
./nextflow run ay-lab/selfish --f1="/.../map1.txt" \
                              --f2="/.../map2.txt" \
                              --ch=2 \
                              --r=100kb -profile singularity
Bioconda install isn't currently available.
Selfish depends on a number of Python packages. These are the packages used by Selfish:
|-f1||--file1||Location of contact map 1. (See below for format.) Not required for HiC-Pro input.|
|-f2||--file2||Location of contact map 2. (See below for format.) Not required for HiC-Pro input.|
|-r||--resolution||Resolution of the provided contact maps.|
|-o||--outfile||Name of the output file.|
|-ch||--chromosome||Specify which chromosome to run the program for.|
|-m1||--matrix1||Location of matrix file, only for HiC-Pro type input.|
|-m2||--matrix2||Location of matrix file, only for HiC-Pro type input.|
|-bed1||--bed1||Location of bed file, only for HiC-Pro type input.|
|-bed2||--bed2||Location of bed file, only for HiC-Pro type input.|
|-t||--tsvout||If specified, outputs will be written as a TSV file. Specify the p-value threshold for which the results will be written to the file (e.g. -t 0.05).|
|-c||--changes||Name of the output file that has the log fold changes between the inputs.|
|-b1||--biases1||Location of bias/normalization vector file for contact map 1. (See below for format.)|
|-b2||--biases2||Location of bias/normalization vector file for contact map 2. (See below for format.)|
|-sz||--sigmaZero||Sigma0 parameter for Selfish. Default is experimentally chosen for 5Kb resolution.|
|-i||--iterations||Iteration count parameter for Selfish. Default is experimentally chosen for 5Kb resolution.|
|-v||--verbose||Whether the program prints its progress while running. Default is True.|
|-p||--plot||Whether the program plots its results. Highly discouraged for high resolutions (<50kb), as computing the plots will take a lot of time. For high resolutions, we recommend using the output matrix and plotting small sections of it manually. Default is False.|
|-lm||--lowmem||Use float32 instead of float64. Uses less memory at the cost of precision. Default is False.|
|-V||--version||Shows the version of the tool.|
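When Selfish is driven from a script, the argument table above can be turned into a command line programmatically. The following is only a sketch: `build_selfish_command` is a hypothetical helper (not part of Selfish), it assumes every flag passed to it takes a value, and it only assembles the command without running it.

```python
import shlex


def build_selfish_command(args):
    """Turn a dict of short flag names (without the dash) into an argument list.

    Hypothetical helper for illustration; assumes each flag takes one value.
    """
    cmd = ["selfish"]
    for flag, value in args.items():
        cmd += [f"-{flag}", str(value)]
    return cmd


# Same arguments as the usage example, plus a TSV p-value threshold.
cmd = build_selfish_command({
    "f1": "/path/to/contact/map1.txt",
    "f2": "/path/to/contact/map2.txt",
    "ch": 2,
    "r": "100kb",
    "o": "./output.npy",
    "t": 0.05,
})

# Print a shell-safe version of the command for inspection.
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` once Selfish is installed.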
SELFISH supports four different input formats: plain text, .hic, .cool, and .bed/.matrix pairs (HiC-Pro format).
Text Contact Maps
Contact maps need to have the following format. They must not have a header. Values must be separated by either a space, a tab, or a comma.
|Chromosome||Midpoint 1||Chromosome||Midpoint 2||Contact Count|
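As a rough illustration of this layout, the sketch below parses rows with five fields separated by a space, a tab, or a comma. The function name `parse_contact_line`, the chromosome label `2`, and the midpoint values are made up for the example; it is not the parser Selfish itself uses.

```python
import re


def parse_contact_line(line):
    """Split one contact-map row into (chrom1, mid1, chrom2, mid2, count)."""
    # Fields may be separated by spaces, tabs, or commas.
    fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    chrom1, mid1, chrom2, mid2, count = fields
    return chrom1, int(mid1), chrom2, int(mid2), float(count)


# Hypothetical rows, one per allowed separator; there is no header line.
rows = [
    "2\t50000\t2\t150000\t12",
    "2,150000,2,250000,7",
    "2 50000 2 250000 3",
]
parsed = [parse_contact_line(r) for r in rows]
print(parsed[0])
```

A check like this can be run over a file before passing it to Selfish to catch header lines or rows with missing columns early.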
Bed-Matrix pairs (HiC-Pro format)
For HiC-Pro input, the user must provide a chromosome with the -ch argument. The .bed and .matrix files must have the same name apart from the extension. Either file name can be provided as input to Selfish, and the program will search for the second file automatically.
For .hic files, the user must provide a chromosome with the -ch argument. Selfish uses Juicer's straw tool to read .hic files.
For .cool files, the user must provide a chromosome with the -ch argument. Selfish uses cooler to read .cool files.
Bias (normalization) File
The bias file needs to have the following format. It must use the same midpoint format as the contact maps and must not have a header.
The output of Selfish is a matrix of p-values indicating the probability of differential conformation (smaller values are more significant).
X and Y coordinates indicate the bin midpoints.
Another optional output is the log fold changes file. It is simply produced by
log2((map1 + 1) / (map2 + 1))
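The formula above can be reproduced with NumPy on two small matrices. This is a sketch on made-up counts; whether Selfish applies any normalization to the maps before this step is not stated here, so the example uses raw values.

```python
import numpy as np

# Two hypothetical 2x2 contact maps (made-up counts).
map1 = np.array([[10.0, 3.0], [3.0, 8.0]])
map2 = np.array([[4.0, 3.0], [3.0, 0.0]])

# The +1 pseudocount keeps the ratio finite where a map has zero contacts.
log_fc = np.log2((map1 + 1) / (map2 + 1))
print(log_fc)
```

Positive entries mark bin pairs with more contacts in map 1, negative entries mark more contacts in map 2, and equal counts give 0.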
The outputs are written as binary NumPy files (.npy), which can be read with NumPy as follows.
import numpy as np
matrix = np.load("/path/to/output/selfish.npy")
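Once loaded, the matrix can be thresholded to list the bin pairs with small p-values. The sketch below uses a tiny stand-in matrix saved to a temporary file rather than a real Selfish output, and it assumes the matrix is symmetric, so only the upper triangle is reported. The 0.05 cutoff is an arbitrary choice for illustration.

```python
import os
import tempfile

import numpy as np

# Stand-in for a real Selfish output: a small symmetric matrix of p-values.
demo = np.array([
    [1.0, 0.2, 0.01],
    [0.2, 1.0, 0.5],
    [0.01, 0.5, 1.0],
])
path = os.path.join(tempfile.mkdtemp(), "selfish.npy")
np.save(path, demo)

matrix = np.load(path)
threshold = 0.05  # arbitrary cutoff for this example

# Keep the upper triangle only, so each bin pair is listed once.
rows, cols = np.nonzero(np.triu(matrix < threshold))
significant = list(zip(rows.tolist(), cols.tolist()))
print(significant)
```

The same indexing can be used to cut out a small window, for example `matrix[:100, :100]`, when plotting sections of a high-resolution result manually.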
If you use Selfish in your work, please cite our Bioinformatics paper:
Abbas Roayaei Ardakany, Ferhat Ay, Stefano Lonardi, Selfish: discovery of differential chromatin interactions via a self-similarity measure, Bioinformatics, Volume 35, Issue 14, July 2019, Pages i145–i153, https://doi.org/10.1093/bioinformatics/btz362