Note: This package is in active development and functionality might change or not work correctly (yet)!
Dynamic machine learning database for genomics. Supports common BED-like data formats such as .bed, .narrowPeak, and .bedgraph, as well as the binary bigWig format.
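For readers new to BED-like files, the sketch below writes and re-reads a tiny three-column BED file using only the Python standard library. The coordinates and the helpers `write_bed` and `read_bed` are made up for illustration and are not part of PeakSQL. BED coordinates are 0-based and half-open, so a region's length is `end - start`.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical example regions: (chromosome, start, end), 0-based half-open.
regions = [("chr1", 100, 250), ("chr1", 1000, 1101), ("chr2", 0, 50)]


def write_bed(path, rows):
    """Write regions as a tab-separated, three-column BED file."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle, delimiter="\t", lineterminator="\n").writerows(rows)


def read_bed(path):
    """Read the first three BED columns back as (chrom, start, end) tuples."""
    with open(path, newline="") as handle:
        return [
            (row[0], int(row[1]), int(row[2]))
            for row in csv.reader(handle, delimiter="\t")
            # Skip empty lines and optional header/comment lines.
            if row and not row[0].startswith(("#", "track", "browser"))
        ]


bed_path = Path(tempfile.mkdtemp()) / "binding_sites.bed"
write_bed(bed_path, regions)
parsed = read_bed(bed_path)

# Half-open intervals: length is simply end - start.
lengths = [end - start for _, start, end in parsed]
print(parsed)
print(lengths)
```

Formats such as .narrowPeak add further columns after these three, which this sketch ignores.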
PeakSQL can be installed through pip:

    pip install peaksql

Or installed from source:

    git clone https://github.com/vanheeringen-lab/peaksql
    cd peaksql
    pip install .
Basic usage:

    import peaksql

    # paths to our files
    db_file = "peakSQL.sqlite"  # where to store our database
    assembly = "/path/to/hg38.fa"
    data = "binding_sites.bed"

    # load data into database
    db = peaksql.database.DataBase(db_file)
    db.add_assembly(assembly, assembly="hg38", species="human")
    # refer to the assembly by the name it was registered under above
    db.add_data(data, assembly="hg38")

    # now load as dataset
    dataset = peaksql.BedRegionDataSet(db_file, seq_length=101, stride=200)

    # use the dataset in your application
    for seq, label in dataset:
        ...
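The `seq_length` and `stride` arguments are not explained here. One plausible reading, and only an assumption, is that the dataset tiles each chromosome with windows of `seq_length` bases whose start positions are `stride` bases apart. The standard-library sketch below illustrates that reading with a hypothetical helper `tile_windows`; it is not PeakSQL's code, and PeakSQL's actual behaviour may differ.

```python
def tile_windows(chrom_length, seq_length, stride):
    """Return (start, end) windows of width seq_length, spaced stride apart.

    Assumption: windows that would run past the chromosome end are dropped.
    Coordinates are 0-based and half-open, as in BED files.
    """
    if seq_length <= 0 or stride <= 0:
        raise ValueError("seq_length and stride must be positive")
    return [
        (start, start + seq_length)
        for start in range(0, chrom_length - seq_length + 1, stride)
    ]


# A toy 1,000 bp chromosome with seq_length=101 and stride=200.
windows = tile_windows(chrom_length=1000, seq_length=101, stride=200)
print(windows)
```

Under this reading, a stride larger than the window width leaves gaps between consecutive windows (99 bp in this toy case) instead of overlapping them.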