exoclasma-index

Reference Sequence and Capture Intervals Preparation for ExoClasma Suite


Keywords: bed, bioinformatics, fasta, indexing, restrictase
License: GPL-3.0
Install: pip install exoclasma-index==0.9.5

Description

exoclasma-index is a tool for reference sequence and capture interval preparation, part of the upcoming ExoClasma Suite.

Features:

  • Prepare FASTA reference sequence (purge names, uncompress, etc.)
  • Create restriction sites for Juicer, as described in config.json. Restriction enzymes currently available:
    • HindIII
    • DpnII
    • MboI
    • Sau3AI
    • Arima
  • Create indices for:
    • SAMtools (samtools faidx)
    • BWA (bwa index)
    • optional: GATK (gatk CreateSequenceDictionary)
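
To illustrate what locating restriction sites involves, here is a minimal Python sketch. It is not the tool's own code: the `MOTIFS` table and the `find_sites` helper are hypothetical, the layout of config.json is not reproduced, and the positions returned are motif start coordinates rather than whatever coordinate convention the tool writes for Juicer. The motifs themselves are the standard recognition sequences of the listed enzymes (the Arima kit recognises both GATC and GANTC).

```python
import re

# Standard recognition motifs; this dict is a stand-in for config.json,
# whose real structure is not shown in this README.
MOTIFS = {
    "HindIII": ["AAGCTT"],
    "DpnII": ["GATC"],
    "MboI": ["GATC"],
    "Sau3AI": ["GATC"],
    "Arima": ["GATC", "GA[ACGT]TC"],  # N written as a regex character class
}


def find_sites(sequence, enzyme):
    """Return sorted 1-based start positions of every motif occurrence."""
    sequence = sequence.upper()
    positions = set()
    for motif in MOTIFS[enzyme]:
        # A lookahead reports overlapping occurrences as well.
        for match in re.finditer(f"(?={motif})", sequence):
            positions.add(match.start() + 1)
    return sorted(positions)
```

For example, `find_sites("ccAAGCTTgg", "HindIII")` reports a single site starting at position 3.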

Also, the tool can check and prepare exome BED files (captures).
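
The exact checks applied to a capture are not listed here, so the following Python sketch only shows the kind of validation a BED check typically involves: at least three tab-separated columns, integer coordinates, and start smaller than end. The function name `check_bed_lines` is hypothetical and is not part of exoclasma-index.

```python
def check_bed_lines(lines):
    """Return (sorted records, problems) for an iterable of BED text lines."""
    records, problems = [], []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        # Skip blank lines and the usual BED header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            problems.append(f"line {number}: fewer than 3 columns")
            continue
        chrom = fields[0]
        try:
            start, end = int(fields[1]), int(fields[2])
        except ValueError:
            problems.append(f"line {number}: non-integer coordinates")
            continue
        if start < 0 or start >= end:
            problems.append(f"line {number}: invalid interval {start}-{end}")
            continue
        records.append((chrom, start, end))
    return sorted(records), problems
```

Valid intervals come back sorted by chromosome name and start, and each rejected line is reported with its line number.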

This is a pre-release. Use it at your own risk!

Installation

python3 -m pip install exoclasma-index

Command-line dependencies

The first three (SAMtools, BWA, and bedtools) are available from the Ubuntu repositories:

apt install samtools bwa bedtools

GATK should be installed into a Miniconda environment, as described by its developers.
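
Before running the tool, it can help to confirm that the dependencies are on PATH. A minimal Python sketch, assuming the executables are named `samtools`, `bwa`, `bedtools`, and `gatk` (the helper `missing_tools` is hypothetical, not part of exoclasma-index):

```python
import shutil


def missing_tools(tools=("samtools", "bwa", "bedtools", "gatk")):
    """Return the names of the given executables that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All command-line dependencies found.")
```

If GATK is not needed (see --no-gatk below), pass a tuple without `gatk`.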

Usage

Reference preparation

exoclasma-index Reference -f ${FastaFile} -n ${ReferenceName} -p ${ParentDirectory}

Optional: -d ${ReferenceDescription}, --no-gatk
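
As a rough picture of the indexing part of this step, the Python sketch below builds the three commands named in the feature list for a given FASTA path. It is an assumption-laden illustration: the helper `index_commands` is hypothetical, and the options the tool actually passes, as well as the other work it does (name purging, restriction sites, the output JSON), are not shown.

```python
def index_commands(fasta_path, use_gatk=True):
    """Return the index-building commands as argument lists."""
    commands = [
        ["samtools", "faidx", fasta_path],
        ["bwa", "index", fasta_path],
    ]
    if use_gatk:
        # Skipped when use_gatk is False, analogous to --no-gatk.
        commands.append(["gatk", "CreateSequenceDictionary", "-R", fasta_path])
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them; "reference.fa" is a placeholder.
    for command in index_commands("reference.fa", use_gatk=False):
        print(" ".join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` to execute it.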

Capture preparation

exoclasma-index Capture -b ${BedFile} -n ${CaptureName} -g ${GenomeInfoJSON}

GenomeInfoJSON is the JSON file created by exoclasma-index Reference.

Optional: -d ${ReferenceDescription}