viral_verify
viralVerify rewrite/refactor for PyPI packaging and distribution, maintainability and clarity.
NOTE: The BLAST+ search option has been removed. The results output table differs from that of the original viralVerify. The Naive Bayes classifier training script has not been ported yet.
- Free software: MIT license
- Documentation: https://viral-verify.readthedocs.io.
Features
- Gene prediction with Prodigal in metagenomic mode
- HMMer3 hmmsearch for protein domains in predicted genes
- Naive Bayes classification of contigs as viral/not viral based on HMMer3 results
- Output of detailed contig classification results table in CSV format
- Output of contigs based on classification into separate FASTA files
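The gene prediction and domain search steps listed above can be sketched as two external command lines. This is only an illustration under stated assumptions, not the package's actual implementation: the helper names `prodigal_cmd` and `hmmsearch_cmd` and all file names are hypothetical, while the flags themselves (`-p meta`, `-i`, `-a`, `-o` for Prodigal; `--noali`, `--cpu`, `--domtblout` for hmmsearch) are standard options of those tools.

```python
from pathlib import Path
from typing import List


def prodigal_cmd(contigs: Path, proteins: Path, genes: Path) -> List[str]:
    """Build a Prodigal command line for metagenomic-mode gene prediction."""
    return [
        "prodigal",
        "-p", "meta",         # metagenomic mode
        "-i", str(contigs),   # input contigs (FASTA)
        "-a", str(proteins),  # predicted protein translations (FASTA)
        "-o", str(genes),     # gene coordinates output
    ]


def hmmsearch_cmd(hmm_db: Path, proteins: Path, domtblout: Path,
                  threads: int = 1) -> List[str]:
    """Build an hmmsearch command line that writes a per-domain hits table."""
    return [
        "hmmsearch",
        "--noali",                      # skip alignment output
        "--cpu", str(threads),          # number of worker threads
        "--domtblout", str(domtblout),  # tabular per-domain hits
        str(hmm_db),                    # HMM database, e.g. Pfam-A.hmm
        str(proteins),                  # proteins predicted by Prodigal
    ]
```

Each returned list could be passed to `subprocess.run(cmd, check=True)`, provided both tools are installed and on `$PATH`.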
Requirements
An HMMer3 HMM database is required. For example, the latest version of Pfam-A HMM:
NOTE: Please extract any compressed HMM DB ($ gunzip Pfam-A.hmm.gz)
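If `gunzip` is not available, the same extraction can be done from Python's standard library. The sketch below uses a hypothetical helper named `gunzip`; unlike the command-line tool, it leaves the compressed file in place.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(src: Path, dest: Path) -> Path:
    """Decompress the gzip file `src` into `dest` and return `dest`.

    The data is streamed in chunks, so a large HMM database is never
    loaded into memory all at once.
    """
    with gzip.open(src, "rb") as fin, open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest
```

For example, `gunzip(Path("Pfam-A.hmm.gz"), Path("Pfam-A.hmm"))` would write the uncompressed database next to the download.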
Software dependencies:
Python dependencies:
Installation
Conda
It's recommended that you use Conda to install the required software (Prodigal and HMMer3) and Python dependencies.
$ conda env create -f environment.yml
Pip
If you have Prodigal and HMMer3 installed in your $PATH, and Python 3.6 or greater, you can use pip to install viral_verify:
$ pip install viral_verify
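Because pip installs only the Python package, it can be worth confirming that the external tools are actually reachable on `$PATH`. A minimal sketch using `shutil.which` follows; the helper names `find_tools` and `missing_tools` are hypothetical, and the executable names `prodigal` and `hmmsearch` are assumed to be the ones a standard install provides.

```python
import shutil
from typing import Dict, Iterable, List, Optional


def find_tools(names: Iterable[str] = ("prodigal", "hmmsearch")) -> Dict[str, Optional[str]]:
    """Map each executable name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(found: Dict[str, Optional[str]]) -> List[str]:
    """Return the sorted names of executables that could not be located."""
    return sorted(name for name, path in found.items() if path is None)
```

Calling `missing_tools(find_tools())` returns an empty list when both programs are found.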
Usage
$ viral_verify --help
Usage: viral_verify [OPTIONS]

  HMM and Naive Bayes classification of contig sequences as either viral,
  plasmid or chromosomal.

  Requires Prodigal for gene prediction and hmmsearch from HMMer3 for
  searching for Pfam HMM profiles.

Options:
  -i, --input-fasta PATH          Input fasta file  [required]
  -o, --outdir PATH               Output directory  [required]
  -H, --hmm-db PATH               Path to Pfam-A HMM database  [required]
  -t, --threads INTEGER           Number of threads (default=16)
  -p, --output-plasmids-separately
                                  Output predicted plasmids separately?
  --prefix TEXT                   Output file prefix (default: None)
  --uncertainty-threshold FLOAT   Uncertainty threshold (Natural log
                                  probability) (default=3.0)
  --naive-bayes-classifier-table PATH
                                  Table of protein domain frequencies to use
                                  for Naive Bayes classification (default="/ho
                                  me/pkruczkiewicz/repos/viral_verify/viral_ve
                                  rify/data/classifier_table.txt")
  -v, --verbose                   Logging verbosity
  --version                       Show the version and exit.
  --help                          Show this message and exit.
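To call the tool from a script, the options shown in the help text can be assembled into an argument list. This is a hedged sketch: the helper `viral_verify_cmd` is hypothetical and not part of the package, the file names in the usage note are placeholders, and only flags that appear in the help output are used.

```python
from typing import List, Optional


def viral_verify_cmd(input_fasta: str, outdir: str, hmm_db: str,
                     threads: Optional[int] = None,
                     prefix: Optional[str] = None,
                     plasmids_separately: bool = False) -> List[str]:
    """Build a viral_verify command line from the documented options."""
    cmd = [
        "viral_verify",
        "--input-fasta", input_fasta,  # required
        "--outdir", outdir,            # required
        "--hmm-db", hmm_db,            # required
    ]
    if threads is not None:
        cmd += ["--threads", str(threads)]
    if prefix:
        cmd += ["--prefix", prefix]
    if plasmids_separately:
        cmd.append("--output-plasmids-separately")
    return cmd
```

For instance, `subprocess.run(viral_verify_cmd("contigs.fasta", "results", "Pfam-A.hmm", threads=8), check=True)` would run the classification, assuming `viral_verify` is installed in the active environment.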
Credits
The original source code, design and conception can be found at viralVerify. This is merely a rewrite for easier packaging via PyPI, adding some CI with Travis-CI and organizing the code for maintainability and clarity.
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.