tasmanian-mismatch

Tasmanian: a tool to analyze mismatches by read and by position in high-throughput sequencing data


License
GPL-3.0
Install
pip install tasmanian-mismatch==1.0.7

Documentation



Tasmanian

A tool for the analysis of reference mismatches in high-throughput sequencing data from DNA samples. Unlike other tools, it can evaluate the portions of reads that overlap with specified regions (e.g. repeats).

Goals

The main goal is to identify systematic mismatches that might confound the detection of SNPs or other variants, which may or may not be associated with biological outcomes. We noticed that a set of regions, not necessarily misplaced in the reference genome, has a dramatic effect on this analysis. Tasmanian therefore provides a way to split the reads that intersect these regions and to report the information in separate tables, so that neither intersecting nor non-intersecting reads are filtered out. This also gives the researcher a more accurate picture of how these regions influence the observed artifacts.

Overview of Tasmanian use:

samtools view bam | run_intersections [OPTIONS] | run_tasmanian [OPTIONS]
  1. Classification of each base of the read as overlapping (either contained or boundary; see figure below) or non-overlapping with the regions of interest listed in a BED/bedGraph file.
  2. Positional analysis of artifacts, split by read 1 and read 2.
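
The two steps above can be illustrated with a small Python sketch. This is not Tasmanian's own code: the function names are hypothetical, and the meaning given here to "contained" (the read lies entirely inside one region) and "boundary" (the read only partly overlaps a region) is an assumption, since the figure is what defines those categories. The mismatch tally also assumes an ungapped alignment with read and reference already in the same orientation, which real data will not always satisfy.

```python
from collections import Counter
from typing import List, Tuple

Region = Tuple[int, int]  # half-open [start, end) interval, as in BED


def classify_read(read_start: int, read_end: int, regions: List[Region]) -> str:
    """Label a read as 'contained', 'boundary' or 'non-overlapping'.

    Assumption: 'contained' = read lies entirely inside one region,
    'boundary' = read only partly overlaps a region.
    """
    label = "non-overlapping"
    for start, end in regions:
        if start <= read_start and read_end <= end:
            return "contained"
        if read_start < end and start < read_end:
            label = "boundary"
    return label


def overlap_mask(read_start: int, read_end: int, regions: List[Region]) -> List[bool]:
    """Per-base flags: True where the reference position falls in any region."""
    return [any(s <= pos < e for s, e in regions)
            for pos in range(read_start, read_end)]


def tally_mismatches(flag: int, read_seq: str, ref_seq: str, table: Counter) -> None:
    """Count mismatches by (read number, position in read, 'ref>read').

    Assumes an ungapped alignment where read_seq and ref_seq are already
    in the same orientation and of equal length.
    """
    # SAM flag bits: 0x40 = first segment (read 1), 0x80 = last segment (read 2).
    read_no = 1 if flag & 0x40 else 2 if flag & 0x80 else 0
    for pos, (ref_base, read_base) in enumerate(zip(ref_seq, read_seq)):
        if ref_base != read_base:
            table[(read_no, pos, f"{ref_base}>{read_base}")] += 1


# Example with one hypothetical region and two toy reads.
regions = [(100, 200)]
print(classify_read(180, 230, regions))   # read straddles the region edge
print(overlap_mask(198, 202, regions))    # which bases fall inside the region

table = Counter()
tally_mismatches(67, "ACGT", "ACCT", table)    # flag 67: read 1
tally_mismatches(147, "ACGT", "TCGT", table)   # flag 147: read 2
print(table)
```

Keeping a per-base mask next to the per-read label is what allows the overlapping and non-overlapping portions of the same read to be counted in separate tables instead of discarding the read.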

[Figure: classification of read bases as contained, boundary, or non-overlapping]


The output includes tables for manipulating and plotting the data, and a built-in report for quick access to the data (see figure below).

[Figure: built-in report]

  • Easy-to-use command-line and Nextflow implementation.
  • Includes a Galaxy wrapper.

Contributing

Contributions are welcome and encouraged.

Authors

License

The Tasmanian artifact metrics tool is open-source software released under the GNU General Public License v3.0 (GPL-3.0).