Physcraper

Physcraper


License
GPL-3.0
Install
pip install Physcraper==1.0.0

Documentation

physcraper

Gene tree updating

python scrape_align_est.py ott_study_id ott_tree_id seqaln matrix_type runname

Arguments: ott_study_id = OpenTree study identifier
ott_tree_id = Tree id from that study
seqaln = the sequence alignment that generated that tree
matrix_type = alignment matrix type (only tested with fasta so far)
runname = a name for this run

Study needs to be in Phylesystem, get the ott_study_id and ott_tree_id from the curator app.

example:

cd shard/ascomycota
python /home/ejmctavish/projects/otapi/physcraper/scrape_align_est.py pg_873 tree1679 tree1679.fas fasta ascomycota

right now needs to be run in own study dir, because it generates and looks for various files.

also, it blasts each seq, which means it is slooow and gets slower.

Dependencies

  • Papara
  • raxmlHPC

Python modules

  • peyotl (needs to be configured to access phylesystem)
  • Bio
  • Dendropy

Preprocessing steps:

Get alignement from ?treebase?, using orginal file: convert to fasta and break out to single gene: e.g. rpb2 3218-6016 (this info is in the nexus) using: preprocess.py input.nex output_stub start stop

e.g.

preprocess.py M4058.nexorg rpb2 3218 6016

This will write a fasta file with just your gene/region of interest