multiqc-dumpling

MultiQC plugin for Dumpling DMS pipeline


Keywords
bioinformatics
License
MIT
Install
pip install multiqc-dumpling==0.2.1

Documentation

MultiQC_dumpling Module

Description

This MultiQC plugin supports analysis and quality control of deep mutational scanning libraries with DIMPLE and dumpling. It specifically parses baseline library sequencing runs to check that library generation was successful.

Metrics and plots

This plugin generates several QC metrics and plots for baseline library sequencing runs:

Metrics

  • Mean and median counts: The mean and median number of times each variant is observed in a sequencing run.
  • Number of zero counts: The number of variants that were not observed in a sequencing run.
  • Fraction of zero counts: The fraction of variants that were not observed in a sequencing run (0-1).
  • Mean and median sequencing depth: The mean and median number of times each variant at a position was sampled in a sequencing run (i.e., the coverage).
  • Maximum and minimum sequencing depth: The maximum and minimum number of times each variant at a position was sampled in a sequencing run (i.e., the coverage).
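As an illustration of the count-based metrics above, here is a minimal Python sketch. It is not the plugin's implementation: the function name `summarize_counts`, the dictionary keys, and the example variants are hypothetical, and it assumes the input already contains one entry per designed variant, with 0 for variants that were never observed. The depth metrics are not covered here.

```python
from statistics import mean, median


def summarize_counts(counts):
    """Summarize per-variant counts (hypothetical helper, not the plugin's API).

    counts: dict mapping a variant name to the number of times it was observed.
    Designed variants that were never observed are assumed to be present with 0.
    """
    values = list(counts.values())
    if not values:
        raise ValueError("counts must contain at least one variant")
    n_zero = sum(1 for v in values if v == 0)
    return {
        "mean_count": mean(values),
        "median_count": median(values),
        "n_zero": n_zero,                   # number of zero counts
        "frac_zero": n_zero / len(values),  # fraction of zero counts (0-1)
    }


# Made-up example data for illustration only.
example = {"p.Ala2Gly": 12, "p.Ala2Val": 0, "p.Leu3Pro": 7, "p.Leu3Ser": 1}
summary = summarize_counts(example)
print(summary)
```

A high fraction of zero counts in a baseline run would suggest that many designed variants are missing from the library.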

Plots

  • Variant counts: A histogram of the number of times each variant is observed in a sequencing run.
  • Coverage: A histogram of the number of times each variant at a position was sampled in a sequencing run (i.e., the read depth).

Other metrics and plots generated by the dumpling pipeline are also included in the MultiQC report.

Installation

To install the MultiQC_dumpling module, use pip:

pip install multiqc_dumpling

Usage

This module is intended for use as part of the dumpling pipeline.

To run it manually, you need to provide some additional files.

First, provide a multiqc_config.yaml file with the following contents:

  • orf: The nucleotide coordinates of the ORF in the reference sequence, in the form "start-end". If you are not doing mapping, just provide the length of the ORF as "1-length".
  • variants_file: A CSV file containing the designed or expected variants in the library. The dumpling/DIMPLE workflows generate this file automatically. If you provide your own, it should have at least a "pos" column with the amino-acid position of the variant and an "hgvs" column with the variant in HGVS format.
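If you supply these inputs yourself, it may help to check them before running MultiQC. The following Python sketch shows one way to do that. The helper names `parse_orf` and `missing_variant_columns` are hypothetical and not part of the plugin, the example values are made up, and the check that coordinates start at 1 is an assumption based on the "1-length" form described above.

```python
import csv
import io


def parse_orf(orf):
    """Parse an ORF string of the form "start-end" into two integers.

    Assumes coordinates start at 1, as suggested by the "1-length" form.
    """
    start_text, end_text = orf.split("-")
    start, end = int(start_text), int(end_text)
    if start < 1 or end < start:
        raise ValueError(f"invalid ORF coordinates: {orf!r}")
    return start, end


def missing_variant_columns(csv_text, required=("pos", "hgvs")):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    return [column for column in required if column not in header]


# Made-up example values for illustration only.
print(parse_orf("1-900"))
print(missing_variant_columns("pos,hgvs\n2,p.Ala2Gly\n"))
print(missing_variant_columns("position,hgvs\n2,p.Ala2Gly\n"))
```

An empty list from `missing_variant_columns` means both required columns are present. To check a file on disk, read its text first and pass that to the function.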

License

This project is licensed under the MIT License.

Contact

For any questions or suggestions, please open an issue on the GitHub repository.