DNA input and output library for Python and Cython. Includes reader and writer for FASTA and FASTQ files, support for samtools faidx files, and generators for solid and gapped q-grams (k-mers).


License
MIT
Install
pip install dinopy==3.0.0

Documentation

Dinopy - DNA input and output for Python


Dinopy's goal is to make files containing biological sequences easily and efficiently accessible to Python programmers, allowing them to focus on their application instead of file I/O.

import dinopy

fq_reader = dinopy.FastqReader("reads.fastq")
# some_function and analyze stand in for user-defined code.
for sequence, name, quality in fq_reader.reads(quality_values=True):
    if some_function(quality):
        analyze(sequence)
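
The filter applied to each read's quality values in the loop above is left to the user. As one possible illustration, such a filter could compare the mean Phred score of a read against a threshold. The sketch below is plain Python and does not use dinopy; it assumes the quality values arrive as Phred+33 encoded bytes, and the names mean_phred, passes_quality and min_mean are hypothetical.

```python
def mean_phred(quality, offset=33):
    """Mean Phred score of a quality string given as bytes (assumed Phred+33)."""
    if len(quality) == 0:
        return 0.0
    # Iterating over bytes yields integers, so each value is an ASCII code.
    return sum(q - offset for q in quality) / len(quality)


def passes_quality(quality, min_mean=20.0):
    """Hypothetical quality filter: keep reads with a high mean score."""
    return mean_phred(quality) >= min_mean


good = b"IIIIIIII"  # 'I' is ASCII 73, i.e. Phred 40
bad = b"########"   # '#' is ASCII 35, i.e. Phred 2
print(passes_quality(good), passes_quality(bad))  # True False
```

If the reader is configured to return a different data type for quality values, the decoding step would need to be adapted accordingly.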

Features

  • Easy to use reader and writer for FASTA-, FASTQ-, and SAM-files.
  • Specifiable data type and representation for return values (bytes, bytearrays, strings and integers; see dtype for more information).
  • Works directly on gzipped files.
  • Iterators for q-grams of a sequence (also allowing shaped q-grams).
  • (Reverse) complement.
  • Chromosome selection from FASTA files.
  • Implemented in Cython for additional speedup.
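
To make the q-gram and reverse complement features concrete, here is a minimal sketch in plain Python. It is not dinopy's implementation or API: the function names and the shape notation ('#' for a kept position, '-' for a skipped one) are assumptions chosen for illustration, and the code works on strings only.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string over A, C, G, T and N."""
    complement = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(complement)[::-1]


def qgrams(seq, shape):
    """Yield q-grams of seq for a shape such as "###" (solid) or "#-#" (gapped).

    '#' marks a position that is kept, '-' a position that is skipped.
    """
    keep = [i for i, c in enumerate(shape) if c == "#"]
    span = len(shape)
    # Slide a window of the shape's full length over the sequence.
    for start in range(len(seq) - span + 1):
        yield "".join(seq[start + i] for i in keep)


print(reverse_complement("ACCGT"))   # ACGGT
print(list(qgrams("ACGTA", "###")))  # ['ACG', 'CGT', 'GTA']
print(list(qgrams("ACGTA", "#-#")))  # ['AG', 'CT', 'GA']
```

A solid q-gram is simply the special case of a shape without gaps, which is why one generator can serve both.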

Getting Started

  • If you are new to dinopy you can get started by following the first-steps tutorial.
  • A full list of features, as well as the documentation, can be found here.

Installation

Dinopy can be installed with pip:

$ pip install dinopy

or with conda:

$ conda install -c bioconda dinopy

Additionally, dinopy can be downloaded from Bitbucket and compiled using its setup.py:

  1. Download the source code from Bitbucket.

  2. Install globally:

    $ python setup.py install
    

    or only for the current user:

    $ python setup.py install --user
    
  3. Use dinopy:

    $ python
    
    >>> import dinopy
    

Installation requirements

  • python >= 3.3
  • numpy >= 1.7
  • C and C++ compilers, for example from build-essential (Linux) or Xcode (OS X)
  • Optional: cython >= 0.20

We recommend using Anaconda and the bioconda channel:

$ conda config --add channels r
$ conda config --add channels bioconda
$ conda create -n dinoenv dinopy

Platform support

Dinopy has been tested on Ubuntu, Arch Linux and OS X (Yosemite and El Capitan).

We do not officially support Windows. Dinopy will probably work, but problems may arise from different line-break styles: we assume \n as the line separator, and files that use \r\n are more likely to be encountered on Windows.
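
The line-break issue can be illustrated with a small plain-Python sketch. It does not reflect how dinopy parses files; fasta_records is a hypothetical helper that shows how stripping both \r and \n from each line makes Unix and Windows files yield identical records.

```python
import io


def fasta_records(handle):
    """Yield (name, sequence) pairs from a binary FASTA handle.

    Trailing CR and LF bytes are stripped from every line, so files with
    Unix and Windows line endings are parsed alike.
    """
    name, chunks = None, []
    for line in handle:
        line = line.rstrip(b"\r\n")
        if line.startswith(b">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, b"".join(chunks)
            name, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, b"".join(chunks)


unix = io.BytesIO(b">chr1\nACGT\nAC\n")
windows = io.BytesIO(b">chr1\r\nACGT\r\nAC\r\n")
print(list(fasta_records(unix)) == list(fasta_records(windows)))  # True
```

A parser that only splits on \n would instead keep a stray \r at the end of every name and sequence line in the Windows file.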

Features in development

  • BAM-writer / reader

Planned features

  • GFF3 parser
  • Bisulfite arrays
  • quality-trimming for FASTQ parser

Contact

If you want to report a bug or suggest a new feature, feel free to do so on Bitbucket.

Email:
  • Henning Timm: name.surname <at> tu-dortmund.de
  • Till Hartmann: name.surname <at> tu-dortmund.de

License

Dinopy is Open Source and licensed under the MIT License.