scRADO

doublet detection algorithm for droplet-based single-cell sequencing data


Keywords
bioinformatics, computational-biology, doublets, single-cell
License
MIT
Install
pip install scRADO==1.3

Documentation

RADO: Robust and Accurate DOublet detection


Figure1

Installation

conda env create -f RADO_env.yml
conda activate RADO_env
# optional: conda create -n RADO_env python=3.7
# pip install umap-learn==0.5.3 (to be compatible with python3.7)
# pip install scRADO==1.2

Usage

For scRNA-seq data

from RADO import DoubletDetection
# adata is an AnnData object (typically loaded from an .h5ad file), a common data format in single-cell analysis
adata = DoubletDetection(adata)
# keep only droplets called as singlets (call == 0)
adata = adata[adata.obs['RADO_doublet_call']==0,]

For scATAC-seq data

from RADO import DoubletDetection
# assumes adata.X is the peak matrix
adata = DoubletDetection(adata, atac_data=True)
# keep only droplets called as singlets (call == 0)
adata = adata[adata.obs['RADO_doublet_call']==0,]

DoubletDetection returns the adata object with a predicted doublet score and a doublet call for each droplet in the dataset. The predictions are stored in adata.obs['RADO_doublet_score'] and adata.obs['RADO_doublet_call']. In the doublet call, 0 denotes a singlet and 1 denotes a doublet.
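Before filtering, it can help to check how many droplets were called as doublets. The following is a minimal sketch, not part of RADO: summarize_calls is a hypothetical helper, and the toy values are made up for illustration. It relies only on the two column names described above and accepts anything indexable by column name, so adata.obs (a pandas DataFrame) or a plain dict of lists should both work. The scale of the score is not documented here, so the helper makes no assumption about its range.

```python
from collections import Counter


def summarize_calls(obs):
    """Summarize RADO output columns (hypothetical helper, not part of RADO).

    `obs` is anything indexable by column name whose columns yield one value
    per droplet when iterated, e.g. adata.obs or a dict of lists.
    """
    calls = [int(c) for c in obs["RADO_doublet_call"]]
    scores = [float(s) for s in obs["RADO_doublet_score"]]
    counts = Counter(calls)
    n = len(calls)

    def mean(values):
        # NaN for an empty group, so a dataset without doublets does not crash.
        return sum(values) / len(values) if values else float("nan")

    return {
        "n_droplets": n,
        "n_singlets": counts.get(0, 0),
        "n_doublets": counts.get(1, 0),
        "doublet_rate": counts.get(1, 0) / n if n else float("nan"),
        "mean_score_singlets": mean([s for s, c in zip(scores, calls) if c == 0]),
        "mean_score_doublets": mean([s for s, c in zip(scores, calls) if c == 1]),
    }


# Toy stand-in for adata.obs; the values are invented for illustration only.
toy_obs = {
    "RADO_doublet_score": [0.05, 0.10, 0.90, 0.20, 0.80],
    "RADO_doublet_call": [0, 0, 1, 0, 1],
}
summary = summarize_calls(toy_obs)
print(summary)
```

On real output, the call would be summarize_calls(adata.obs), run before the filtering step so that the doublets are still present in the object.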

Performance

18 scRNA-seq datasets

Figure2

2 scATAC-seq datasets

Figure2

Datasets

The 16 scRNA-seq datasets were collected from the benchmarking paper of Xi and Li. They were converted to H5AD format using sceasy; the processing script is convertH5AD.R.

The 2 DOGMA-seq datasets are from Xu et al. The versions with the original singlet/doublet annotations must be requested from the original authors.