pysamiterators

Pysam-related iterators


Keywords: mate-pair, pysam, iteration
License: MIT
Install: pip install pysamiterators==1.9

Documentation

Pysam Iterators

This module contains functions and iterators that help when working with pysam.

Functionality

  • Iteration over reads belonging to the same fragment (R1, R2)
  • Caching of pysam.FastaFile reference sequences to memory for faster access
  • get_aligned_pairs using a supplied FASTA file when the MD tag doesn't match
  • get_aligned_pairs that additionally reports the sequencer cycle
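To make the first item concrete, here is a minimal sketch of how reads of the same fragment can be paired by query name. It is an illustration only and not the code used by pysamiterators: `Read` is a hypothetical stand-in for pysam.AlignedSegment, and `pair_mates` is a made-up helper name.

```python
from collections import namedtuple

# Hypothetical stand-in for pysam.AlignedSegment (illustration only)
Read = namedtuple("Read", ["query_name", "is_read1"])


def pair_mates(reads):
    """Yield (R1, R2) tuples for reads that share a query name.

    Reads whose mate never shows up are yielded at the end,
    with None in the slot of the missing mate.
    """
    pending = {}  # query_name -> first read seen for that fragment
    for read in reads:
        mate = pending.pop(read.query_name, None)
        if mate is None:
            # First read of this fragment: wait for its mate
            pending[read.query_name] = read
            continue
        # Order the two reads so that R1 always comes first
        yield (read, mate) if read.is_read1 else (mate, read)
    # Whatever is left never found a mate
    for read in pending.values():
        yield (read, None) if read.is_read1 else (None, read)
```

This sketch assumes at most two alignments per query name; secondary or supplementary alignments would need extra handling.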

Prerequisites

The only prerequisites are Python 3.6 or higher and pysam.

Installation

pip3 install https://github.com/BuysDB/pysamiterators/archive/master.zip

Examples

Iteration over R1 and R2:

import pysamiterators.iterators as pits
import pysam
with pysam.AlignmentFile('test.bam') as bam:
    for R1, R2 in pits.MatePairIterator(bam):
        pass

Iteration over R1 and R2 on chromosome 2 (arguments are passed to pysam.AlignmentFile.fetch):

import pysamiterators.iterators as pits
import pysam
with pysam.AlignmentFile('test.bam') as bam:
    for R1, R2 in pits.MatePairIterator(bam, contig='chr2'):
        print(R1, R2)

Iteration over query and reference bases, where the reference base is extracted from a FASTA file. Make sure the FASTA file has been indexed (samtools faidx).

import pysamiterators.iterators as pits
import pysam
# Wrap the indexed FASTA file so reference sequences are cached in memory
reference = pits.CachedFasta(pysam.FastaFile('test.fasta'))
with pysam.AlignmentFile('test.bam') as bam:
    # Keep only the first mate pair found on chr2
    for R1, R2 in pits.MatePairIterator(bam, contig='chr2'):
        break
    for read_index, reference_pos, reference_base in pits.ReferenceBackedGetAlignedPairs(
            R1,
            reference=reference,
            matches_only=True,
            with_seq=True):
        print(read_index, reference_pos, reference_base)
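
The idea of caching reference sequences in memory, as CachedFasta does in the example above, can be pictured with the sketch below. It is not the library's implementation: `CachedSequenceSource` and `DictSource` are hypothetical names, and the only assumption about the wrapped object is a `fetch(contig, start, end)` method that returns the whole contig when start and end are omitted.

```python
class DictSource:
    """Hypothetical stand-in for an indexed FASTA reader (illustration only)."""

    def __init__(self, sequences):
        self.sequences = sequences
        self.calls = 0  # number of times the underlying source was queried

    def fetch(self, contig, start=None, end=None):
        self.calls += 1
        return self.sequences[contig][start:end]


class CachedSequenceSource:
    """Load each contig once, then serve every later request from memory."""

    def __init__(self, source):
        self.source = source
        self.cache = {}  # contig name -> full sequence

    def fetch(self, contig, start=None, end=None):
        if contig not in self.cache:
            # First request for this contig: read the full sequence once
            self.cache[contig] = self.source.fetch(contig)
        return self.cache[contig][start:end]


source = DictSource({"chr2": "ACGTACGT"})
cached = CachedSequenceSource(source)
first = cached.fetch("chr2", 2, 5)
second = cached.fetch("chr2", 0, 2)
```

The trade-off in this sketch is memory for speed: every contig that is touched stays in memory in full, which pays off when many reads on the same contig are looked up one after another.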