qtlsearch
Release 1.0.2

Uses OMA's HOGs to predict genes related to QTL.

License: LGPL-3.0
Install: pip install qtlsearch==1.0.2

Documentation

QTLSearch

QTLSearch is a piece of software that searches for candidate causal genes in QTL studies by combining Gene Ontology annotations across many species, leveraging hierarchical orthologous groups (HOGs).

First, a QTLSearch database is built using qtlsearch-init. It contains OMA HOGs annotated with GO / ChEBI terms from the latest releases, and is based on the QTL and mapping files that the user provides.

Then, the search is performed using qtlsearch-run. A single database can be built for multiple QTL and the search then performed on each of these individually, which enables the use of a job scheduler. See the examples from the paper for how this was done using LSF.

Installation

Requires Python >= 3.6. Install the package from PyPI, which also resolves its dependencies, using pip install qtlsearch.

Alternatively, clone this repository and install manually.

qtlsearch-init -- Building a QTLSearch Database

Creates a database file qtlsearch.db in the current directory.

Usage

Required arguments: --species, --qtl, --annotation_map.

usage: qtlsearch-init [-h] [--no_api_cache] [--api_cache_path API_CACHE_PATH]
                      [--api_endpoint API_ENDPOINT]
                      [--oma_version OMA_VERSION] [--data_path DATA_PATH]
                      --species SPECIES [SPECIES ...] --qtl QTL [QTL ...]
                      --annotation_map ANNOTATION_MAP [ANNOTATION_MAP ...]

Arguments

Quick reference table

Flag	Default	Description
`--species`		Species to build database for.
`--qtl`		List of paths to QTL files.
`--annotation_map`		Mapping from trait in QTL file to annotation IDs (e.g., GO, ChEBI).
`--data_path`	`./data`	Path to store downloaded data.
`--api_cache_path`	`./api_cache`	Path to store OMA API cache.
`--no_api_cache`		If set, disables caching of OMA API calls.
`--oma_version`	Current	Enables use of old releases, through database download.

Descriptions

`--species`

List of species to build the database for. Input as UniProt species codes (e.g., ARATH for Arabidopsis thaliana). These must be in the same order as the files in --qtl.

`--qtl`

List of paths to QTL file(s). These are tab-separated value files, format: <Trait, Chromosome, Start (Mbp), End (Mbp)>. An optional first column can be added containing QTL IDs, otherwise the ID will correspond to the line number in the results. Note: the files must not have header names; the file names must be in the same order as the species in --species.
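
The format rules above can be checked before building a database. The following is a minimal sketch, not part of QTLSearch itself: the `QTL` tuple and `parse_qtl_rows` are hypothetical helper names, and the 1-based line numbering used for the fallback ID is an assumption.

```python
import csv
import io
from typing import List, NamedTuple


class QTL(NamedTuple):
    """One QTL row (hypothetical helper type, not part of QTLSearch)."""
    qtl_id: str
    trait: str
    chromosome: str
    start_mbp: float
    end_mbp: float


def parse_qtl_rows(handle) -> List[QTL]:
    """Parse header-less, tab-separated QTL rows with 4 or 5 columns."""
    qtls = []
    for line_number, row in enumerate(csv.reader(handle, delimiter="\t"), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) == 5:
            qtl_id, trait, chromosome, start, end = row
        elif len(row) == 4:
            # No ID column: fall back to the line number (assumed 1-based here).
            trait, chromosome, start, end = row
            qtl_id = str(line_number)
        else:
            raise ValueError(
                f"line {line_number}: expected 4 or 5 columns, got {len(row)}"
            )
        # float() raises ValueError on non-numeric text, which also catches
        # a header row that was left in the file by mistake.
        start_mbp, end_mbp = float(start), float(end)
        if start_mbp > end_mbp:
            raise ValueError(f"line {line_number}: start is after end")
        qtls.append(QTL(qtl_id, trait, chromosome, start_mbp, end_mbp))
    return qtls


# Demonstration only: one row without and one row with the optional ID column.
example = io.StringIO(
    "fructose\t5\t21.871688\t24.169319\n"
    "63\tgalactose\t5\t8.947509\t11.245141\n"
)
for qtl in parse_qtl_rows(example):
    print(qtl)
```

A real file would normally use one layout throughout; the mixed rows here only show both branches.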

`--annotation_map`

Mapping from trait in QTL file to annotation IDs (e.g., GO, ChEBI). These are tab-separated value files, format: <Trait, GO Term, ChEBI Term>. Note: the files must have header names "trait", "go", "chebi". The "chebi" column may be omitted if not required.
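
A mapping file can be checked in the same spirit. This is a sketch under assumptions: `read_annotation_map` is a hypothetical helper, and it treats "trait" and "go" as required because only "chebi" is described as optional.

```python
import csv
import io
from typing import Dict, List, Optional, Tuple

REQUIRED_COLUMNS = ("trait", "go")  # assumed required; only "chebi" is optional


def read_annotation_map(handle) -> Dict[str, List[Tuple[str, Optional[str]]]]:
    """Read a tab-separated mapping file with a trait/go[/chebi] header."""
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing header column(s): {', '.join(missing)}")
    mapping: Dict[str, List[Tuple[str, Optional[str]]]] = {}
    for row in reader:
        # row.get() yields None when the "chebi" column is absent.
        chebi = row.get("chebi") or None
        mapping.setdefault(row["trait"], []).append((row["go"], chebi))
    return mapping


example = io.StringIO(
    "trait\tgo\tchebi\n"
    "fructose\tGO:0046370\tCHEBI:28757\n"
)
print(read_annotation_map(example))
```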

`--data_path`

Path to store downloaded data. This is safe to share between multiple runs of qtlsearch-init, unless the OMA database has been updated in the meantime: the stored species co-ordinates would then be out of date.

`--api_cache_path`

Path to store OMA API cache. The expectation is to use this to share the data between multiple runs of qtlsearch-init.

`--no_api_cache`

Flag that disables caching of API calls. If set, the directory for persistent caching is not created. If you don't expect to build multiple databases, it is best to set this.

`--oma_version`

Enables use of old releases. Set to string of release, e.g., All.Sep2014 to use the species and HOGs from the September 2014 release of the OMA browser. This will download a very large database, instead of using the API.

Example Database Build

Here, a small database is built using the QTL associated with fructose or galactose abundance from the Lisec et al. dataset (in Arabidopsis thaliana [ARATH]) used in the paper.

Note: as we download (and cache) files from the UniProt-GOA and ChEBI databases, this will take quite some time. These files can be reused for further database builds, however.

Two files are required: one listing our trait to GO / ChEBI terms; the other listing the QTL.

QTL Listing

The following table shows the QTL listing. It is stored as a tab-separated value file, without headers, as qtl.tsv.

QTL ID	Trait (Metabolite)	Chromosome	Start (Mbp)	End (Mbp)
12	fructose	5	21.871688	24.169319
13	fructose	5	21.871688	23.594912
63	galactose	5	8.947509	11.245141
111	fructose	4	3.562645	7.833454
112	fructose	5	16.414812	17.563628

Trait Mapping

The following table shows the trait (metabolite) mapping used in the paper. It is stored as a tab-separated value file, with headers, as mapping.tsv.

trait	go	chebi
fructose	`GO:0046370`	`CHEBI:28757`
galactose	`GO:0046369`	`CHEBI:28260`
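
Both input files can also be written programmatically. The sketch below uses the values from the two tables above; `write_tsv` is a hypothetical helper. It writes into a temporary directory for demonstration, so point `out_dir` at the working directory for a real build.

```python
import csv
import tempfile
from pathlib import Path

# Values copied from the QTL listing table; kept as strings so that the
# co-ordinates are written exactly as given.
QTL_ROWS = [
    ("12", "fructose", "5", "21.871688", "24.169319"),
    ("13", "fructose", "5", "21.871688", "23.594912"),
    ("63", "galactose", "5", "8.947509", "11.245141"),
    ("111", "fructose", "4", "3.562645", "7.833454"),
    ("112", "fructose", "5", "16.414812", "17.563628"),
]
MAPPING_HEADER = ("trait", "go", "chebi")
MAPPING_ROWS = [
    ("fructose", "GO:0046370", "CHEBI:28757"),
    ("galactose", "GO:0046369", "CHEBI:28260"),
]


def write_tsv(path, rows, header=None):
    """Write rows as tab-separated values, with an optional header row."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        if header is not None:
            writer.writerow(header)
        writer.writerows(rows)


out_dir = Path(tempfile.mkdtemp())  # use Path(".") for the current directory
write_tsv(out_dir / "qtl.tsv", QTL_ROWS)  # no header, as required for QTL files
write_tsv(out_dir / "mapping.tsv", MAPPING_ROWS, header=MAPPING_HEADER)
print((out_dir / "mapping.tsv").read_text())
```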

Build

The database can then be built as follows:

qtlsearch-init --species ARATH --qtl qtl.tsv --annotation_map mapping.tsv

This will retrieve the current GO and ChEBI annotations which can be reused to build extra QTLSearch databases. The OMA API will be used to download the species protein co-ordinates required.

A qtlsearch.db file will be the output, placed in the current directory. This can then be used to search for your QTL in the next step.
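
When several species are involved, the species codes and QTL files must be passed in matching order. One way to avoid mismatches is to keep them as pairs and derive the command line from those. This is a sketch only: `build_init_command` is a hypothetical helper, and it uses just the three required flags documented above.

```python
import shlex
from typing import List, Sequence, Tuple


def build_init_command(
    species_and_qtl: Sequence[Tuple[str, str]],
    annotation_maps: Sequence[str],
) -> List[str]:
    """Build a qtlsearch-init argument list from (species, QTL file) pairs."""
    if not species_and_qtl or not annotation_maps:
        raise ValueError("at least one species/QTL pair and one mapping file are needed")
    species = [code for code, _ in species_and_qtl]
    qtl_files = [path for _, path in species_and_qtl]
    # Both lists come from the same pairs, so their order always agrees.
    return [
        "qtlsearch-init",
        "--species", *species,
        "--qtl", *qtl_files,
        "--annotation_map", *annotation_maps,
    ]


argv = build_init_command([("ARATH", "qtl.tsv")], ["mapping.tsv"])
# Print a shell-safe version of the command; the list itself could be
# passed to subprocess.run(argv, check=True) to launch the build.
print(" ".join(shlex.quote(arg) for arg in argv))
```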

qtlsearch-run -- Running QTLSearch

It is necessary to build a QTLSearch database prior to running the search. This is explained above.

Note: a single database can be built for multiple QTL and then the search performed on these individually, enabling the use of a job scheduler. See examples from the paper to see how this was done using LSF.
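
To run each QTL as its own job, the QTL file can be split into one file per QTL. The sketch below is an illustration, not the approach used in the paper: `split_qtl_file` and `run_command` are hypothetical helpers, QTL IDs are assumed to be unique, and it is assumed that qtlsearch-run accepts a QTL file holding a subset of the QTL the database was built for. Rows without an ID column get their original line number written out explicitly, so that the ID is not lost when the file is split.

```python
import csv
import tempfile
from pathlib import Path
from typing import List


def split_qtl_file(qtl_path: Path, out_dir: Path) -> List[Path]:
    """Write each QTL row to its own header-less TSV file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with open(qtl_path, newline="") as handle:
        for line_number, row in enumerate(csv.reader(handle, delimiter="\t"), start=1):
            if not row:
                continue  # skip blank lines
            if len(row) == 4:
                # Preserve the implicit line-number ID as an explicit column.
                row = [str(line_number)] + row
            part = out_dir / f"qtl_{row[0]}.tsv"
            with open(part, "w", newline="") as out:
                csv.writer(out, delimiter="\t", lineterminator="\n").writerow(row)
            written.append(part)
    return written


def run_command(species: str, qtl_file: Path, db: str, results: Path) -> List[str]:
    """Argument list for one qtlsearch-run job, using only documented flags."""
    return [
        "qtlsearch-run",
        "--species", species,
        "--qtl", str(qtl_file),
        "--db", db,
        "--results", str(results),
    ]


work = Path(tempfile.mkdtemp())
(work / "qtl.tsv").write_text(
    "12\tfructose\t5\t21.871688\t24.169319\n"
    "63\tgalactose\t5\t8.947509\t11.245141\n"
)
for part in split_qtl_file(work / "qtl.tsv", work / "parts"):
    # Each command would be handed to the job scheduler of choice.
    print(" ".join(run_command("ARATH", part, "qtlsearch.db", part.with_suffix(".results.tsv"))))
```

Giving each job its own --results path keeps parallel jobs from overwriting the default ./results.tsv.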

Usage

Required arguments: --species, --qtl, --db.

usage: qtlsearch-run [-h] [--data_path DATA_PATH] --species SPECIES
                     [SPECIES ...] --qtl QTL [QTL ...] --db DB [--with_p]
                     [--replicates REPLICATES] [--results RESULTS]

Arguments

Quick reference table

Flag	Default	Description
`--db`		Path to database created with `qtlsearch-init`.
`--species`		Species corresponding to QTL files.
`--qtl`		List of paths to QTL files.
`--results`	`./results.tsv`	Path to print the results table to.
`--with_p`		Boolean whether to compute empirical p-values.
`--replicates`	1,000	Number of replicates for computing empirical distribution.
`--data_path`	`./data`	Path where downloaded data was stored.

Descriptions

`--db`

Path to the database that is output from qtlsearch-init. If everything is completed in the same directory, without renaming the database file, this will be "qtlsearch.db".

`--species`

List of species corresponding to the QTL files. Input as UniProt species codes (e.g., ARATH for Arabidopsis thaliana). These must be in the same order as the files in --qtl.

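`--qtl`

List of paths to QTL file(s). The file names must be in the same order as the species in --species.
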
`--results`

Path to print the results table to. Defaults to ./results.tsv. The output is in TSV format and includes a header of column names in the first row.

`--with_p`

Flag that enables computing the empirical p-values. This takes a long time, and the results may still be useful without it.

`--replicates`

Number of replicates for computing the empirical distribution, in order to estimate the p-values.
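
As a general illustration of why the number of replicates matters, the sketch below estimates an empirical p-value with a common add-one estimator. It is not taken from the QTLSearch source, and the tool may compute its p-values differently; `empirical_p_value` and the random null scores are placeholders.

```python
import random
from typing import Sequence


def empirical_p_value(observed: float, null_scores: Sequence[float]) -> float:
    """Share of null replicates scoring at least as high as the observation.

    The +1 in numerator and denominator counts the observation itself, so
    the estimate can never be exactly zero.
    """
    at_least_as_extreme = sum(1 for score in null_scores if score >= observed)
    return (at_least_as_extreme + 1) / (len(null_scores) + 1)


rng = random.Random(0)
# Placeholder null distribution: 1,000 uniform scores in [0, 1).
null_scores = [rng.random() for _ in range(1000)]

# An observation above every null score gives the smallest attainable
# estimate for 1,000 replicates, 1/1001.
print(empirical_p_value(2.0, null_scores))
```

With this estimator the smallest reportable p-value is 1/(replicates + 1), so more replicates give finer resolution at the cost of a longer run.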

`--data_path`

Path to where the downloaded data was stored. This should be set to the same as it was for the call to qtlsearch-init.

Example Run

Following on from the example database build, we can now run the search for potential causal genes in the QTL associated with fructose / galactose abundance.

qtlsearch-run --species ARATH --qtl qtl.tsv --db qtlsearch.db

This will, by default, create the output in a file results.tsv in the current directory.
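
Since the results file is a TSV with column names in the first row, it can be loaded with the Python standard library. This is a minimal sketch: `load_results` is a hypothetical helper, and the column names in the demonstration are placeholders, because the actual result columns are not listed here.

```python
import csv
import io
from typing import Dict, List, Tuple


def load_results(handle) -> Tuple[List[str], List[Dict[str, str]]]:
    """Return the header and the rows of a tab-separated results table."""
    reader = csv.DictReader(handle, delimiter="\t")
    rows = [dict(row) for row in reader]
    return list(reader.fieldnames or []), rows


# Demonstration with placeholder columns; for a real run, use
# open("results.tsv", newline="") in place of the in-memory text.
demo = io.StringIO("col_a\tcol_b\nx\ty\n")
header, rows = load_results(demo)
print("columns:", header)
print("rows:", len(rows))
```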

License

QTLSearch is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

QTLSearch is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with QTLSearch. If not, see http://www.gnu.org/licenses/.

Dependencies: 13
Dependent packages: 0
Dependent repositories: 0
Total releases: 4
Latest release: Feb 8, 2019
First release: Apr 11, 2018
Forks: 0
Watchers: 1
Contributors: 0
Repository size: 572 KB
SourceRank: 8

Releases

1.0.3: Feb 8, 2019
1.0.2: Jan 8, 2019
1.0.1: Apr 11, 2018
1.0.0: Apr 11, 2018

Last synced: 2021-02-20 22:35:13 UTC
