sisua

SemI-SUpervised generative Autoencoder for single cell data


License: MIT
Install: pip install sisua==0.4.4

SISUA

Semi-supervised Single-cell modeling:

Reference:

  • Trung Ngo Trong, Roger Kramer, Juha Mehtonen, Gerardo González, Ville Hautamäki, Merja Heinäniemi. "SISUA: SemI-SUpervised Generative Autoencoder for Single Cell Data", ICML Workshop on Computational Biology, 2019. [pdf]

Installation

SISUA requires Python 3.6. Install the stable version via pip:

pip install sisua

Or install the nightly version from GitHub:

pip install git+https://github.com/trungnt13/sisua@master

For developers, we provide a conda environment for contributing to SISUA, named sisua_env:

conda env create -f=sisua_env.yml

Getting started

  1. The basics:
  2. Single-cell analysis:
    • Latent space
    • Imputation of gene expression
    • Prediction of protein markers
  3. Advanced technical topics:
    • Probabilistic embedding
    • Hierarchical modeling (coming soon)
    • Causal analysis (coming soon)
    • Cross datasets analysis (coming soon)
  4. Benchmarks:
  5. Further development:

Toolkits

We provide binary toolkits for fast and efficient analysis of single-cell datasets:

  • sisua-train: trains single-cell modeling algorithms and supports training multiple systems in parallel.
  • sisua-analyze: evaluates, compares, and interprets trained models.
  • sisua-embed: probabilistic embedding for semi-supervised training.
  • sisua-data: coming soon
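To confirm that the toolkits listed above are reachable after installation, a short check along these lines may help. This is a minimal sketch that only uses the standard library; the helper name find_toolkits is hypothetical and not part of SISUA, and it assumes pip placed the toolkit executables on your PATH.

```python
import shutil

# Toolkit names as listed above; sisua-data is omitted because it is marked "coming soon".
TOOLKITS = ("sisua-train", "sisua-analyze", "sisua-embed")


def find_toolkits(names=TOOLKITS):
    """Map each toolkit name to its executable path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_toolkits().items():
        print("{:<14} {}".format(name, path or "not found on PATH"))
```

A toolkit reported as not found usually means the environment where SISUA was installed is not the active one.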

Some important arguments:

-model

Name of a function declared in models:

  • scvi: single-cell Variational Inference model
  • dca: Deep Count Autoencoder
  • vae: single-cell Variational Autoencoder
  • movae: SISUA

-ds

Name of a dataset declared in data. All predefined datasets are described in docs.

Some good datasets for practicing:

  • pbmc8k_ly
  • cortex
  • pbmcecc_ly
  • pbmcscvi
  • pbmcscvae
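The two arguments above can be combined into a single sisua-train invocation. The sketch below only builds and prints the command, it does not run it. It assumes that -model and -ds each take one space-separated value, which this README does not spell out; build_train_command is a hypothetical helper, not part of SISUA.

```python
import shlex

# Model names taken from the list above.
KNOWN_MODELS = ("scvi", "dca", "vae", "movae")


def build_train_command(model, dataset):
    """Return the argv list for a sisua-train run (the command is not executed)."""
    if model not in KNOWN_MODELS:
        raise ValueError(
            "unknown model {!r}; expected one of {}".format(model, KNOWN_MODELS))
    # Assumption: -model and -ds each take a single space-separated value.
    return ["sisua-train", "-model", model, "-ds", dataset]


argv = build_train_command("movae", "pbmc8k_ly")
print(" ".join(shlex.quote(arg) for arg in argv))
# prints: sisua-train -model movae -ds pbmc8k_ly
```

Once the argument syntax has been checked against the installed toolkit, the same list could be passed to subprocess.run(argv, check=True). Dataset names are not validated here because the list above is only a selection.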

Configuration

By default, data is saved in your home folder at ~/bio_data, and experiment outputs are stored at ~/bio_log.

You can customize these two paths using the environment variables:

  • For storing downloaded and preprocessed data: SISUA_DATA
  • For the experiments: SISUA_EXP

For example:

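import os
# Set both variables before importing sisua, as in this example
# (assumption: the paths are read when sisua.data is imported).
os.environ['SISUA_DATA'] = '/tmp/bio_data'
os.environ['SISUA_EXP'] = '/tmp/bio_log'

from sisua.data import EXP_DIR, DATA_DIR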

print(DATA_DIR) # /tmp/bio_data
print(EXP_DIR)  # /tmp/bio_log

Or set the variables in advance in the shell:

export SISUA_DATA=/tmp/bio_data
export SISUA_EXP=/tmp/bio_log
python sisua/train.py
# or using the provided toolkit: sisua-train
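The default-plus-override behaviour described in this section can be illustrated with a few lines of standard-library Python. This is a sketch of the general pattern only, not SISUA's actual implementation; resolve_dir is a hypothetical helper, and the defaults are the ones stated above.

```python
import os


def resolve_dir(env_name, default):
    """Return the directory named by env_name, falling back to default."""
    # An unset or empty variable falls back to the default location.
    path = os.environ.get(env_name) or default
    # Expand "~" to the home folder and normalise to an absolute path.
    return os.path.abspath(os.path.expanduser(path))


data_dir = resolve_dir("SISUA_DATA", "~/bio_data")
exp_dir = resolve_dir("SISUA_EXP", "~/bio_log")
print(data_dir)
print(exp_dir)
```

With neither variable set, both paths resolve inside the home folder. With the exports shown above, they resolve to /tmp/bio_data and /tmp/bio_log.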