scDenorm is an algorithm that reverts normalised single-cell omics data to raw counts, preserving the integrity of the original measurements and ensuring consistent data processing during integration.
Install with pip or conda:
pip install scDenorm
# or
conda install -c changebio scdenorm
Requirements:
numpy
pandas
matplotlib
scanpy
anndata
scipy
tqdm
pathlib
fastcore
colorlog
import scanpy as sc
from scipy.io import mmwrite
from scDenorm.denorm import scdenorm
ad=sc.datasets.pbmc3k()
ad.layers['count']=ad.X.copy()
ad
AnnData object with n_obs × n_vars = 2700 × 32738
var: 'gene_ids'
layers: 'count'
sc.pp.normalize_total(ad, target_sum=1e4)
sc.pp.log1p(ad)
smtx = ad.X.tocsr().asfptype()
smtx.data
array([1.6352079, 1.6352079, 2.2258174, ..., 1.7980369, 1.7980369,
2.779648 ], dtype=float32)
ad.write_h5ad('data/pbmc3k_norm.h5ad')
Write rows 1 to 9 of the normalised matrix to a sparse Matrix Market (.mtx) file:
mmwrite('data/scaled.mtx', smtx[1:10,])
scdenorm('data/pbmc3k_norm.h5ad',fout='data/pbmc3k_denorm.h5ad',verbose=1)
INFO:my_logger:Reading input file: data/pbmc3k_norm.h5ad
/home/huang_yin/anaconda3/envs/sc/lib/python3.9/site-packages/anndata/__init__.py:51: FutureWarning: `anndata.read` is deprecated, use `anndata.read_h5ad` instead. `ad.read` will be removed in mid 2024.
warnings.warn(
INFO:my_logger:The dimensions of this data are (2700, 32738).
INFO:my_logger:Selecting base
INFO:my_logger:Denormlizing ...the base is 2.718281828459045
b is 2.718281828459045
100%|██████████| 2700/2700 [00:02<00:00, 1071.27it/s]
INFO:my_logger:Writing output file: data/pbmc3k_denorm.h5ad
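The log above reports that the selected base is e, which matches the `sc.pp.log1p` call used to create the input. As a rough illustration of what reverting such data involves (a minimal sketch, not scDenorm's actual implementation), the snippet below inverts a log1p transform and then rescales one cell so that its smallest non-zero value becomes 1. The helper name `denorm_row` is hypothetical, and the sketch assumes that the smallest non-zero raw count in the cell is exactly 1.

```python
import numpy as np

def denorm_row(values, base=np.e):
    """Hypothetical helper: revert log1p + total-count scaling for one cell.

    Assumption: the smallest non-zero raw count in the cell is 1.
    """
    values = np.asarray(values, dtype=np.float64)
    # Undo log1p: base**x - 1 gives the scaled (pre-log) values.
    scaled = np.power(base, values) - 1.0
    nonzero = scaled[scaled > 0]
    if nonzero.size == 0:
        return scaled  # an all-zero cell has nothing to rescale
    # Under the assumption above, the smallest non-zero scaled value
    # corresponds to a raw count of 1, so dividing by it undoes the scaling.
    return np.rint(scaled / nonzero.min())

# Toy cell: raw counts normalised to 10,000 per cell, then log1p-transformed.
raw = np.array([0.0, 1.0, 2.0, 5.0, 1.0])
normalised = np.log1p(raw / raw.sum() * 1e4)
recovered = denorm_row(normalised)
```

In this toy case `recovered` equals `raw`. A cell whose smallest raw count is larger than 1 would break the stated assumption, so this sketch should not be read as a general solution.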
If no output path is given, scdenorm returns a new AnnData object.
new_ad=scdenorm('data/pbmc3k_norm.h5ad')
new_ad
View of AnnData object with n_obs × n_vars = 2700 × 32738
var: 'gene_ids'
uns: 'log1p'
ad.layers['count'].data
array([1., 1., 2., ..., 1., 1., 3.], dtype=float32)
new_ad.X.data
array([1. , 1. , 2.0000002, ..., 1. , 1. ,
3. ], dtype=float32)
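The two arrays above agree up to floating-point error (for example 2.0000002 versus 2). One way to check this programmatically is sketched below, using plain NumPy on the values visible in the truncated printouts. With the real objects, `ad.layers['count'].data` and `new_ad.X.data` could be passed in instead, assuming both matrices store their non-zero entries in the same order. The function name `max_abs_diff` is hypothetical.

```python
import numpy as np

def max_abs_diff(original, recovered):
    """Hypothetical helper: largest absolute difference between two arrays."""
    original = np.asarray(original, dtype=np.float64)
    recovered = np.asarray(recovered, dtype=np.float64)
    return float(np.max(np.abs(original - recovered)))

# Values copied from the truncated array printouts above.
original = np.array([1., 1., 2., 1., 1., 3.], dtype=np.float32)
recovered = np.array([1., 1., 2.0000002, 1., 1., 3.], dtype=np.float32)

diff = max_abs_diff(original, recovered)
# A loose tolerance absorbs float32 rounding without hiding real mismatches.
counts_match = np.allclose(original, recovered, atol=1e-4)
```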
If the input matrix is gene by cell, set gxc=True.
scdenorm('data/scaled.mtx',fout='data/scd_scaled.h5ad')
100%|██████████| 9/9 [00:00<00:00, 2883.12it/s]
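The file used above was sliced from a cell-by-gene matrix, so its rows are cells. To see which orientation a Matrix Market file has, its shape can be inspected with SciPy. The sketch below uses a small random matrix and a temporary file, both invented for illustration; it does not call scdenorm.

```python
import os
import tempfile

from scipy.io import mmread, mmwrite
from scipy.sparse import random as sparse_random

# Toy cell-by-gene matrix: 9 cells (rows) x 50 genes (columns).
cell_by_gene = sparse_random(9, 50, density=0.2, format="csr", random_state=0)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.mtx")
    mmwrite(path, cell_by_gene)
    loaded = mmread(path).tocsr()  # mmread returns a COO-format sparse object
    # A gene-by-cell version of the same data is simply the transpose.
    gene_by_cell = loaded.T.tocsr()

shape_cxg = loaded.shape
shape_gxc = gene_by_cell.shape
```

Here `shape_cxg` is `(9, 50)` and `shape_gxc` is `(50, 9)`; comparing a file's shape against the expected number of cells and genes shows whether it is stored gene by cell.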
!scdenorm data/pbmc3k_norm.h5ad --fout data/pbmc3k_denorm.h5ad
/home/huang_yin/anaconda3/envs/sc/lib/python3.9/site-packages/anndata/__init__.py:51: FutureWarning: `anndata.read` is deprecated, use `anndata.read_h5ad` instead. `ad.read` will be removed in mid 2024.
warnings.warn(
b is 2.718281828459045
100%|█████████████████████████████████████| 2700/2700 [00:02<00:00, 1090.85it/s]
!scdenorm data/scaled.mtx --fout data/scd_scaled_c.h5ad
100%|███████████████████████████████████████████| 9/9 [00:00<00:00, 1333.31it/s]
The output can also be written in mtx format:
!scdenorm data/scaled.mtx --fout data/scd_scaled_c.mtx
100%|███████████████████████████████████████████| 9/9 [00:00<00:00, 1290.78it/s]
Yin Huang, Anna Vathrakokili Pournara, Ying Ao, Lirong Yang, Hui Zhang, Yongjian Zhang, Sheng Liu, Alvis Brazma, Irene Papatheodorou, Xinlu Yang, Ming Shi, Zhichao Miao. “scDenorm: a denormalisation tool for integrating single-cell transcriptomics data” (under review).