Project STRform: STR sequence compression and formatting
STRform is a package of algorithms for the conversion of forensic short tandem repeat (STR) sequences into various formats, such as the bracketed repeat format for easier reading.
This page and the tool are still under construction. Still to come:
- Testing and validation scripts
- Additional functionalities
Algorithm overview
To find STRs on a read, the tool performs the following steps:
- Using a sliding window, find repeating motifs and record the number of times each motif is repeated
- Convert the repeat stretches in the sequence to bracket format
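The two steps above can be illustrated with a minimal sketch. This is not STRform's actual implementation: the function name `condense_sketch`, the fixed motif length and the `min_repeats` threshold are all assumptions made for illustration, and the real package may detect motifs differently.

```python
def condense_sketch(seq, motif_len=4, min_repeats=2):
    """Hypothetical sketch: collapse tandem repeats of a fixed-length
    motif into bracket notation, scanning left to right."""
    out = []
    i = 0
    while i < len(seq):
        motif = seq[i:i + motif_len]  # current window
        count = 1
        if len(motif) == motif_len:
            # Count how many times the window repeats back to back.
            j = i + motif_len
            while seq[j:j + motif_len] == motif:
                count += 1
                j += motif_len
        if count >= min_repeats:
            # Emit the repeat stretch in bracket format and skip past it.
            out.append(f"[{motif}]{count}")
            i += motif_len * count
        else:
            # No repeat starts here: keep the base and slide by one.
            out.append(seq[i])
            i += 1
    return "".join(out)


print(condense_sketch("TCAATGAATGAATGCC"))  # TC[AATG]3CC
```

The sketch is greedy: it brackets the first repeat it finds at each position, so a different reading frame of the same stretch is never considered.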
Installation
STRform was developed in Python 3.6. It is recommended that the program be installed in a Conda virtual environment.
STRform is now available on PyPI and can be installed via pip:
pip install strform
Installing with pip also installs the required packages automatically.
Usage
There are currently two basic functions for converting sequences to and from bracketed format:
import strform.brackets as sfb # Import the STRform brackets module
sfb.condense_repeats('AATGAATG') # Converts AATGAATG to [AATG]2
sfb.expand_repeats('[AATG]2') # Converts [AATG]2 to AATGAATG
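To show what expanding a bracketed sequence involves, here is a small standalone sketch using a regular expression. It does not call STRform and is not how `expand_repeats` is necessarily implemented. The name `expand_sketch` and the accepted alphabet (A, C, G, T, N) are assumptions.

```python
import re

# Matches one bracketed unit such as "[AATG]3" (assumed alphabet: ACGTN).
_BRACKET_UNIT = re.compile(r"\[([ACGTN]+)\](\d+)")


def expand_sketch(bracketed):
    """Hypothetical sketch: replace each [MOTIF]n unit with the motif
    repeated n times, leaving unbracketed bases unchanged."""
    return _BRACKET_UNIT.sub(lambda m: m.group(1) * int(m.group(2)), bracketed)


print(expand_sketch("[AATG]2"))      # AATGAATG
print(expand_sketch("TC[AATG]3CC"))  # TCAATGAATGAATGCC
```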
Release notes
v0.0.1
- Test version upload to PyPI
Alexander YY Liu | yliu575@aucklanduni.ac.nz