DeltaTopic

Packages to implement BALSAM and DeltaTopic as described in the paper: Unraveling dynamically-encoded latent transcriptomic patterns in pancreatic cancer cells by topic modelling


License: MIT

Install: pip install DeltaTopic==0.1.0

Documentation

PyPi Docs

DeltaTopic: Dynamically-Encoded Latent Transcriptomic pattern Analysis by Topic modeling

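This is the project repository for our paper, "Unraveling dynamically-encoded latent transcriptomic patterns in pancreatic cancer cells by topic modelling".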

Summary

Topic modelling has become an important research tool in single-cell genomics. With a topic model, we can identify distinctive cell topics shared across multiple cells, and the gene programs implicated by each topic can later serve as a predictive model in translational studies. Here, we present a Bayesian topic model that can uncover short-term RNA velocity patterns from a plethora of spliced and unspliced single-cell RNA-seq counts. We show that modelling both types of RNA counts can improve the robustness of statistical estimation and reveal aspects of dynamic change that static analysis can miss. We also show that our modelling framework can identify statistically significant dynamic gene programs in pancreatic cancer data. We found that seven dynamic gene programs (topics) are highly correlated with cancer prognosis and are generally enriched for immune cell types and pathways.

Installation

DeltaTopic requires Python 3.8 or later. We recommend using Miniconda to manage the environment.

Install DeltaTopic from PyPI using:

pip install DeltaTopic

To work with the latest development version, install from GitHub using:

python3 -m pip install git+https://github.com/causalpathlab/DeltaTopic
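
After installing, it can be useful to confirm that the interpreter meets the Python 3.8 requirement and that the package is visible to it. The following is a minimal sketch using only the standard library; the helper names `python_is_supported` and `installed_version` are hypothetical and are not part of DeltaTopic.

```python
import sys
from importlib import metadata  # in the standard library since Python 3.8

MIN_PYTHON = (3, 8)  # minimum version stated in this README


def python_is_supported(version_info=sys.version_info):
    """Return True if the given interpreter version is at least MIN_PYTHON."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


print("Python supported:", python_is_supported())
print("DeltaTopic version:", installed_version("DeltaTopic"))
```

If the second line prints `None`, the package is not installed in the active environment, which usually means the wrong environment is activated.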

Data

We obtained the original FASTQ files for pancreatic ductal adenocarcinoma (PDAC) from the public repository provided by two PDAC studies. The spliced and unspliced count matrices were quantified with kb-python:

kb count -i index.idx -g t2g.txt -x 10xv2 -o ${output} \
-c1 spliced_t2c.txt -c2 unspliced_t2c.txt \
--workflow lamanno --filter bustools \
${fastq1} ${fastq2}
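
When several samples need to be quantified, the command above can be assembled programmatically. The sketch below builds the same argument list for each sample and prints it instead of running it. The flags and reference file names are copied from the command above; the helper `build_kb_count_command`, the sample names, the FASTQ paths, and the `kb_out/` output layout are hypothetical and only illustrate the pattern.

```python
import shlex


def build_kb_count_command(output_dir, fastq1, fastq2):
    """Assemble the `kb count` argument list from this README for one sample."""
    return [
        "kb", "count",
        "-i", "index.idx",
        "-g", "t2g.txt",
        "-x", "10xv2",
        "-o", output_dir,
        "-c1", "spliced_t2c.txt",
        "-c2", "unspliced_t2c.txt",
        "--workflow", "lamanno",
        "--filter", "bustools",
        fastq1, fastq2,
    ]


# Hypothetical sample names and FASTQ paths, for illustration only.
samples = {
    "sample_A": ("sample_A_R1.fastq.gz", "sample_A_R2.fastq.gz"),
    "sample_B": ("sample_B_R1.fastq.gz", "sample_B_R2.fastq.gz"),
}

for name, (r1, r2) in samples.items():
    cmd = build_kb_count_command(f"kb_out/{name}", r1, r2)
    # Print the command rather than execute it. With kb-python installed,
    # subprocess.run(cmd, check=True) would run it.
    print(shlex.join(cmd))
```

Keeping the command as a list of arguments rather than a single string avoids shell-quoting problems when paths contain spaces.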

Run

# train the BALSAM model on the spliced count data
BALSAM --nLV 32 --EPOCHS 100
# train the DeltaTopic model
DeltaTopic --nLV 32 --EPOCHS 100

For full documentation, please refer to the DeltaTopic Documentation.