Tools to filter SAM or BAM files by percent identity or percent of matched sequence


Keywords
alignment, bioinformatics, computational-biology, genomics, python, samtools, sequence-alignment
License
CC-BY-4.0
Install
pip install filtersam==0.0.11

Documentation

A Python tool to filter SAM/BAM files by percent identity or percent of matched sequence


Percent identity is computed as:

$$PI = 100 \frac{N_m}{N_m + N_i}$$

where $N_m$ is the number of matches and $N_i$ is the number of mismatches.
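
As a rough illustration of the formula above, the sketch below counts matches and mismatches from an MD tag string and derives the percent identity. It is not filterSAM's own implementation: the helper names are hypothetical, and the parsing assumes the standard MD syntax (digit runs for matches, single letters for mismatches, `^`-prefixed letters for deleted reference bases, which are skipped here).

```python
import re
from typing import Tuple

# One token per MD element: a match run, a deletion (^ACGT...), or a mismatch.
_MD_TOKEN = re.compile(r"(\d+)|(\^[A-Za-z]+)|([A-Za-z])")


def count_matches_mismatches(md_tag: str) -> Tuple[int, int]:
    """Return (N_m, N_i) for an MD tag value such as '10A5^AC6'."""
    matches = 0
    mismatches = 0
    for run, _deletion, mismatch in _MD_TOKEN.findall(md_tag):
        if run:
            matches += int(run)   # digit run: that many matching bases
        elif mismatch:
            mismatches += 1       # single letter: one mismatched base
        # Deletions are neither matches nor mismatches in this sketch.
    return matches, mismatches


def percent_identity(md_tag: str) -> float:
    """PI = 100 * N_m / (N_m + N_i); returns 0.0 for an empty alignment."""
    n_m, n_i = count_matches_mismatches(md_tag)
    total = n_m + n_i
    return 100 * n_m / total if total else 0.0


# Hypothetical MD value: 21 matches, 1 mismatch, 2 deleted reference bases.
example_pi = percent_identity("10A5^AC6")
```

With the hypothetical tag `10A5^AC6`, the counts are 21 matches and 1 mismatch, so the result is 100 × 21 / 22, roughly 95.5.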

Percent of matched sequence is computed as:

$$PM = 100 \frac{N_m}{L}$$

where $N_m$ is again the number of matches and $L$ is the length of the query sequence.
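
The two metrics can diverge when part of a read is not aligned. The sketch below works through one made-up case: a 100-base read in which 78 bases match, 2 mismatch, and the remaining 20 are assumed to be unaligned (for example, soft-clipped). The function names, the numbers, and the 90% cutoff are illustrative assumptions, not values taken from filterSAM.

```python
def percent_identity(n_matches: int, n_mismatches: int) -> float:
    """PI = 100 * N_m / (N_m + N_i)."""
    total = n_matches + n_mismatches
    if total <= 0:
        raise ValueError("no aligned bases to score")
    return 100 * n_matches / total


def percent_matched(n_matches: int, query_length: int) -> float:
    """PM = 100 * N_m / L, with L the full query sequence length."""
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    return 100 * n_matches / query_length


# Hypothetical read: L = 100, N_m = 78, N_i = 2, 20 bases unaligned.
pi = percent_identity(78, 2)    # 97.5: high identity over the aligned part
pm = percent_matched(78, 100)   # 78.0: lower, since L counts every base

cutoff = 90.0                   # illustrative threshold
passes_identity = pi >= cutoff  # True
passes_matched = pm >= cutoff   # False
```

In this example the read would pass a 90% identity filter but fail a 90% matched-sequence filter, which is why the choice of metric matters for partially aligned reads.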

NOTE

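BAM/SAM files must contain MD tags for filtering by percent identity to work. Aligners such as BWA add an MD tag to each alignment record they write. For files that lack them, MD tags can be generated with samtools (the calmd subcommand, which needs the reference FASTA used for the alignment).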
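
A quick way to check whether a record carries the tag is to look at its optional fields. The sketch below does this for a single text SAM line, assuming the standard layout of 11 mandatory tab-separated columns followed by `TAG:TYPE:VALUE` optional fields. The helper name and the example records are hypothetical, and this is a plain-text check rather than anything filterSAM exposes.

```python
from typing import Optional


def get_md_tag(sam_line: str) -> Optional[str]:
    """Return the MD tag value of one SAM alignment line, or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    # Columns 1-11 are mandatory; optional TAG:TYPE:VALUE fields follow.
    for field in fields[11:]:
        if field.startswith("MD:Z:"):
            return field[len("MD:Z:"):]
    return None


# Hypothetical records: identical except for the presence of an MD tag.
mandatory = ["read1", "0", "chr1", "100", "60", "10M", "*", "0", "0",
             "ACGTACGTAC", "IIIIIIIIII"]
with_md = "\t".join(mandatory + ["NM:i:1", "MD:Z:4T5"])
without_md = "\t".join(mandatory + ["NM:i:1"])

md_present = get_md_tag(with_md)     # "4T5"
md_missing = get_md_tag(without_md)  # None
```

Records for which the helper returns `None` would need MD tags added before they can be scored by percent identity.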

Installation

pip install filtersam

Usage

You can find a Jupyter notebook with usage examples here.

Citation

If you use this software, please cite it as below:

Robaina-Estévez, S. (2022). filterSAM: filter sam/bam files by percent identity or percent of matched sequence (Version 0.0.11) [Computer software]. https://doi.org/10.5281/zenodo.7056278