mudatasets

Multimodal Datasets in MuData format


Keywords
anndata, mudata, multi-omics, multimodal-omics, multiome, muon, scanpy, scrna-seq
License
BSD-3-Clause
Install
pip install mudatasets==0.0.1

Multimodal Datasets

mudatasets provides access to public multimodal datasets, primarily multimodal omics datasets, in MuData format.

MuData library | MuData documentation

Installation

# Stable, with muon
pip install "mudatasets[muon]"
# Dev
pip install git+https://github.com/gtca/mudatasets

Getting started

import mudatasets as mds

Find available datasets

mds.list_datasets()

Load a dataset

mdata = mds.load("pbmc3k_multiome")
print(mdata)

Some common arguments for .load() are:

  • data_dir= sets the directory where the dataset is saved (~/mudatasets/ by default)
  • with_info=True also returns a dictionary describing the dataset as a second return value (False by default)
  • backed=True reads the data in backed mode; applies only to .h5mu and .h5ad files (True by default)
  • files= downloads only the specified files from the dataset
  • full=True downloads all the files defined for the dataset (False by default)
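
The arguments above can be combined in a single call. The following is a minimal sketch, assuming that with_info=True makes load() return the dataset followed by its description dictionary, as the list describes. The helper name load_with_description is hypothetical and not part of mudatasets, and passing data_dir as a string is an assumption. The package is imported inside the function so that the helper can be defined without mudatasets installed.

```python
from pathlib import Path


def load_with_description(name, data_dir=None, backed=True):
    """Load a dataset together with its description (illustrative helper).

    Assumes that with_info=True makes mudatasets.load() return the data
    followed by a dictionary describing the dataset.
    """
    # Imported lazily: defining this helper does not require the package.
    import mudatasets as mds

    kwargs = {"with_info": True, "backed": backed}
    if data_dir is not None:
        # Override the default location (~/mudatasets/) only when requested.
        kwargs["data_dir"] = str(Path(data_dir).expanduser())

    mdata, info = mds.load(name, **kwargs)
    return mdata, info
```

For example, load_with_description("pbmc3k_multiome") would be expected to return the dataset and its description using the default data directory.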

Get dataset info

mds.info("pbmc3k_multiome")

List dataset file names

mds.list_files("pbmc3k_multiome")

Webpage with all the files

mds.serve_webpage(port=8000)

This command launches a local server that serves a simple, temporarily created HTML page at http://localhost:8000 listing the files across all of the datasets.
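
To illustrate the general idea of serving a temporarily created page, here is a rough sketch using only the Python standard library. It is not how mudatasets implements serve_webpage(). The function name serve_file_listing and the files_by_dataset mapping are hypothetical, and the sketch binds to 127.0.0.1 rather than resolving localhost.

```python
import html
import tempfile
import threading
from functools import partial
from http.server import HTTPServer, SimpleHTTPRequestHandler
from pathlib import Path


def serve_file_listing(files_by_dataset, port=8000):
    """Serve a throwaway index.html that lists files per dataset.

    Illustrative only. Returns the running server and the temporary
    directory backing it, so the caller can shut both down.
    """
    tmpdir = tempfile.TemporaryDirectory()

    # Build one heading and one bullet list per dataset.
    sections = []
    for name, files in files_by_dataset.items():
        items = "".join(f"<li>{html.escape(f)}</li>" for f in files)
        sections.append(f"<h2>{html.escape(name)}</h2><ul>{items}</ul>")
    page = "<html><body>" + "".join(sections) + "</body></html>"
    Path(tmpdir.name, "index.html").write_text(page, encoding="utf-8")

    # Serve the temporary directory in a background thread.
    handler = partial(SimpleHTTPRequestHandler, directory=tmpdir.name)
    server = HTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, tmpdir
```

When finished, calling server.shutdown(), server.server_close(), and tmpdir.cleanup() stops the server and removes the temporary page.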