strform

STR sequence compression and formatting


Keywords
forensic, STR, bracketed, repeat, formatting
License
Other
Install
pip install strform==0.0.1

Documentation

    #### #######
   ###          
   ###   #######
   ###   ##     
#####    ##     
                             

Project STRform: STR sequence compression and formatting

STRform is a package of algorithms for the conversion of forensic short tandem repeat (STR) sequences into various formats, such as the bracketed repeat format for easier reading.

This page and the tool are still under construction. Planned additions:

  • Testing and validation scripts
  • Additional functionalities

Algorithm overview

To find STRs in a read, the tool performs the following steps:

  • Using a sliding window, find repeating motifs and record the number of times each motif is repeated
  • Convert the repeat stretches in the sequence to bracketed format
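
The two steps above can be illustrated with a minimal sketch. This is not the STRform implementation: the function name `condense_sketch`, the fixed motif length of 4, and the `min_repeats` threshold are assumptions made for illustration only, and the package itself may detect motifs differently.

```python
def condense_sketch(seq, motif_len=4, min_repeats=2):
    """Illustrative only: collapse consecutive copies of a fixed-length motif.

    Hypothetical helper, not part of strform. Assumes a single motif length.
    """
    out = []
    i = 0
    while i < len(seq):
        motif = seq[i:i + motif_len]  # current window
        count = 1
        if len(motif) == motif_len:
            # Slide forward one motif at a time while the window repeats
            j = i + motif_len
            while seq[j:j + motif_len] == motif:
                count += 1
                j += motif_len
        if count >= min_repeats:
            # Record the repeat stretch in bracketed form
            out.append(f"[{motif}]{count}")
            i += motif_len * count
        else:
            # No repeat starts here: keep the base and move on by one
            out.append(seq[i])
            i += 1
    return "".join(out)


print(condense_sketch("AATGAATG"))          # [AATG]2
print(condense_sketch("TCAATGAATGAATGCC"))  # TC[AATG]3CC
```

Bases that are not part of a repeat stretch are left as they are, so flanking sequence stays readable next to the bracketed repeats.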

Installation

STRform was developed in Python 3.6. It is recommended that the program be installed in a Conda virtual environment.

STRform is now available on PyPI and can be installed via pip:

pip install strform

Installing using the pip command will also install the required packages automatically.

Usage

There are currently two basic functions for converting sequences to and from the bracketed format:

import strform.brackets as sfb # Import the STRform brackets module

sfb.condense_repeats('AATGAATG') # Converts AATGAATG to [AATG]2

sfb.expand_repeats('[AATG]2') # Converts [AATG]2 to AATGAATG
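
To make the bracketed notation concrete, the sketch below shows one way the reverse conversion could work, using only the standard library. The name `expand_sketch` is hypothetical, and the sketch assumes the `[MOTIF]N` form shown above with no spaces between blocks; it is not the code behind `sfb.expand_repeats`.

```python
import re


def expand_sketch(bracketed):
    """Illustrative only: replace each [MOTIF]N block with MOTIF repeated N times.

    Hypothetical helper, not part of strform.
    """
    return re.sub(
        r"\[([A-Za-z]+)\](\d+)",
        lambda m: m.group(1) * int(m.group(2)),
        bracketed,
    )


print(expand_sketch("[AATG]2"))      # AATGAATG
print(expand_sketch("TC[AATG]3CC"))  # TCAATGAATGAATGCC
```

Text outside the brackets is passed through unchanged, so a sequence with flanking bases expands back to its full form.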

Release notes

v0.0.1

  • Test version uploaded to PyPI

Alexander YY Liu | yliu575@aucklanduni.ac.nz