Library for working with biological sequence data as numpy arrays.


Keywords
bionumpy, bioinformatics, numpy, python
License
MIT
Install
pip install bionumpy==1.0.10

BioNumPy

Documentation: https://bionumpy.github.io/bionumpy/

What is BioNumPy?

BioNumPy is a Python library, built on top of NumPy, that enables array programming on biological datasets. It aims to make it easy to read common bioinformatics file formats efficiently into NumPy-like data structures, so that the data can be operated on and analyzed with fast, vectorized operations. Working in BioNumPy should feel much like working in NumPy.

Getting started

  1. Install with pip:

    $ pip install bionumpy

  2. Import BioNumPy and read your data, e.g.:

    >>> import bionumpy as bnp
    >>> import numpy as np
    >>> f = bnp.open("example_data/big.fq.gz")
    >>> # chunk contains the sequences of reads and NumPy-functions can be used
    >>> for chunk in f:
    ...     print(np.sum(chunk.sequence == "G"))
    53686
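
To make the array-programming idea concrete, here is a minimal sketch in plain NumPy of the kind of elementwise comparison used in the loop above. It does not use BioNumPy and does not describe BioNumPy's internals. The reads are made up for illustration, and the names `reads`, `flat`, `offsets` and `per_read` are assumptions of this sketch only.

```python
import numpy as np

# Hypothetical reads, invented for this sketch (not taken from big.fq.gz).
reads = ["GATTACA", "GGCAT", "ACGTG"]

# Store all bases in one flat byte array plus the start offset of each read.
flat = np.frombuffer("".join(reads).encode("ascii"), dtype=np.uint8)
lengths = np.array([len(r) for r in reads])
offsets = np.concatenate(([0], np.cumsum(lengths)[:-1]))

# Elementwise comparison gives a boolean mask; summing it counts the matches.
mask = flat == ord("G")
g_count = int(mask.sum())

# Per-read counts: sum the mask within each read's slice of the flat array.
per_read = np.add.reduceat(mask.astype(np.int64), offsets)

print(g_count)            # total number of G bases across all reads
print(per_read.tolist())  # number of G bases in each read
```

The comparison and the sums each run over the whole array in one call, with no Python loop over individual bases. That is the style of computation the BioNumPy example relies on when it evaluates `chunk.sequence == "G"`.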

Check out the getting started guide and various tutorials in the documentation.

You can also find the latest version of the BioNumPy manuscript here.