Single-Cell ATAC-seq Analysis via Latent feature Extraction

pip install scale-atac==1.1.0




2021.01.14 Update: now compatible with h5ad files and scanpy


The SCALE neural network is implemented in the PyTorch framework.
Running SCALE on CUDA is recommended if available.

Install from GitHub

git clone git://
python install

Installation only requires a few minutes.

Quick Start


Supported input formats:

  • h5ad file
  • count matrix file:
    • rows are peaks and columns are barcodes, in txt / tsv (sep="\t") or csv (sep=",") format
  • mtx folder containing three files:
    • count file: counts in mtx format; filename contains the keyword "count" or "matrix"
    • peak file: one column of peaks in chr_start_end form; filename contains the keyword "peak"
    • barcode file: one column of barcodes; filename contains the keyword "barcode"
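
The filename convention for the mtx folder can be illustrated with a short Python sketch. This is not SCALE's own loader: the function name `find_mtx_files` and the first-match tie-breaking rule are assumptions made for illustration, and only the keywords come from the list above.

```python
from pathlib import Path


def find_mtx_files(folder):
    """Locate the count, peak and barcode files in an mtx folder by keyword.

    Hypothetical helper: it mirrors the naming convention described above,
    not the actual SCALE implementation.
    """
    keywords = {
        "count": ("count", "matrix"),
        "peak": ("peak",),
        "barcode": ("barcode",),
    }
    found = {}
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        name = path.name.lower()
        for role, keys in keywords.items():
            # Assumption: the first file matching a role wins
            if role not in found and any(key in name for key in keys):
                found[role] = path
                break
    missing = sorted(set(keywords) - set(found))
    if missing:
        raise FileNotFoundError("missing files for: " + ", ".join(missing))
    return found
```

Calling `find_mtx_files("my_mtx_folder")` on a folder holding, for example, `matrix.mtx`, `peaks.txt` and `barcodes.txt` would return a dict mapping `"count"`, `"peak"` and `"barcode"` to those paths; the folder and file names here are placeholders.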

Run -d [input]

If the number of clusters k is known: -d [input] -k [k]


Results are saved in the output folder, which includes:

  • the saved model, which can be used with the --pretrain option to reproduce results
  • adata.h5ad: saved data including the Leiden cluster assignments, the latent feature matrix and the UMAP results
  • umap.pdf: visualization of the 2D UMAP embedding of each cell


Get binary imputed data in the folder binary_imputed with the --binary option (recommended to save storage): -d [input] --binary

Or get numerical imputed data in the file imputed_data.txt with the --impute option: -d [input] --impute
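
To see why binary output saves storage, the following NumPy sketch compares the memory footprint of a numerical matrix with its binarized form. The fixed cutoff of 0.5 and the function name `binarize` are assumptions for illustration only; the rule SCALE actually applies with --binary is not described here.

```python
import numpy as np


def binarize(imputed, threshold=0.5):
    """Turn numerical values into 0/1 flags stored as int8.

    The threshold is a hypothetical cutoff, not SCALE's binarization rule.
    """
    return (np.asarray(imputed) > threshold).astype(np.int8)


# Random stand-in for imputed values: 1000 peaks x 200 cells, float64
rng = np.random.default_rng(18)
imputed = rng.random((1000, 200))

binary = binarize(imputed)
packed = np.packbits(binary, axis=None)  # 8 flags per byte

print(imputed.nbytes)  # 1600000 bytes as float64
print(binary.nbytes)   # 200000 bytes as int8
print(packed.nbytes)   # 25000 bytes when bit-packed
```

For the same matrix shape, the int8 form takes one eighth of the float64 memory, and bit-packing reduces it by another factor of eight.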

Useful options

  • save results in a specific folder: [-o] or [--outdir]
  • embed features with tSNE or UMAP: [--embed] tSNE/UMAP
  • filter low-quality cells by number of valid peaks, default 100: [--min_peaks]
  • filter low-quality peaks by number of valid cells, default 10: [--min_cells]
  • modify the initial learning rate, default is 0.002: [--lr]
  • change the maximum number of iterations based on the convergence of the loss, default is 30000: [-i] or [--max_iter]
  • change random seed for parameter initialization, default is 18: [--seed]
  • binarize the imputation values: [--binary]

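
The two quality filters can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not SCALE's code: it reads "valid" as a nonzero count, filters cells before peaks, and uses a hypothetical function name `filter_counts`. The defaults match the --min_peaks and --min_cells values listed above.

```python
import numpy as np


def filter_counts(counts, min_peaks=100, min_cells=10):
    """Filter a peak-by-cell count matrix (rows are peaks, columns are cells).

    Returns the filtered matrix and boolean masks of the kept peaks and cells.
    """
    counts = np.asarray(counts)
    accessible = counts > 0  # assumption: a "valid" entry is any nonzero count
    # Assumed order: drop low-quality cells first, then low-quality peaks
    keep_cells = accessible.sum(axis=0) >= min_peaks
    keep_peaks = accessible[:, keep_cells].sum(axis=1) >= min_cells
    filtered = counts[np.ix_(keep_peaks, keep_cells)]
    return filtered, keep_peaks, keep_cells
```

On a toy matrix such as `[[1, 1, 0], [1, 1, 0], [0, 1, 1], [0, 0, 0]]` with `min_peaks=2` and `min_cells=2`, the third cell and the last two peaks are dropped, leaving a 2 x 2 matrix.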

See more usage of SCALE with --help

Use functions from the SCALE package:

import scale
from scale import *
from scale.plot import *
from scale.utils import *

Running time


Tutorial Forebrain: run SCALE on the dense-matrix Forebrain dataset (k=8, 2088 cells)

Tutorial Mouse Atlas: run SCALE on the sparse-matrix Mouse Atlas dataset (k=30, ~80,000 cells)

Data availability


Lei Xiong, Kui Xu, Kang Tian, Yanqiu Shao, Lei Tang, Ge Gao, Michael Zhang, Tao Jiang & Qiangfeng Cliff Zhang. SCALE method for single-cell ATAC-seq analysis via latent feature extraction. Nature Communications, (2019).