A Modeller-based pipeline to generate homo-oligomers.


Keywords
PDB, structure, protein, oligomerization, oligomers, multimer
License
CC-BY-4.0
Install
pip install ProtCHOIR==1.2.5

ProtCHOIR

This pipeline was devised to create homo-oligomeric structures based on selected subsets of the Protein Data Bank (PDB).

With ProtCHOIR you can supply either a sequence in FASTA format or a protomeric structure in the PDB format to obtain homo-oligomeric models based on homologues.

Prerequisites

The following packages and external programs are used by ProtCHOIR and must be installed and reachable through either the system executable path or the Python path.

Python packages

  • progressbar2
  • pandas
  • biopython
  • pathlib
  • parasail
  • networkx
  • jinja2
  • numpy
  • matplotlib
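A quick way to confirm that the Python packages listed above are importable is sketched below. This is only an illustrative check, not part of ProtCHOIR; the mapping from distribution names to import names (for example, progressbar2 to progressbar and biopython to Bio) is an assumption based on how those projects are usually imported.

```python
from importlib.util import find_spec

# Distribution name (as listed above) -> import name.
# The import names are assumptions based on each project's usual convention.
REQUIRED = {
    "progressbar2": "progressbar",
    "pandas": "pandas",
    "biopython": "Bio",
    "pathlib": "pathlib",
    "parasail": "parasail",
    "networkx": "networkx",
    "jinja2": "jinja2",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
}


def missing_packages(required=REQUIRED):
    """Return the distribution names whose import name cannot be located."""
    return sorted(
        dist for dist, module in required.items() if find_spec(module) is None
    )


# Report anything that still needs to be installed.
print("Missing packages:", missing_packages() or "none")
```

Any name reported as missing can then be installed with pip before running the pipeline.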

External software (must be installed separately)

Note: PISA, GESAMT and MolProbity may be installed as part of the CCP4 Software Suite
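If the external programs are expected to be on the executable path, their availability can be checked with a small helper such as the one below. This is a hedged sketch: the executable names used in the example (pisa, gesamt) are assumptions about a typical CCP4 installation and may differ on your system, and ProtCHOIR itself may instead read explicit paths from its configuration file.

```python
import shutil


def missing_executables(names):
    """Return the names that cannot be resolved on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


# Assumed executable names; adjust them to match your installation.
CANDIDATES = ["pisa", "gesamt"]
print("Not found on PATH:", missing_executables(CANDIDATES) or "none")
```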

Installation

The scripts are available as a PyPI project. Install them with:

pip install ProtCHOIR

Initial Setup

If this is the first time you are running ProtCHOIR and you do not provide a configuration file (with --conf), the program will ask whether you want one to be created. This configuration file simply holds the paths to all the required external software.

The file also contains the path to a locally generated database (referred to as "choirdb"), in which ProtCHOIR looks for homo-oligomeric proteins that can serve as templates for modelling.

Make sure that the directory to which the choirdb variable is pointing actually exists.
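One way to guarantee that the directory exists before starting the lengthy database build is sketched below. The helper name ensure_directory and the example path are hypothetical; the function only creates the folder and does not read or modify the ProtCHOIR configuration file.

```python
import tempfile
from pathlib import Path


def ensure_directory(path):
    """Create the directory (and any missing parents); return its resolved path."""
    directory = Path(path).expanduser()
    # exist_ok=True makes this a no-op when the directory is already there,
    # but it still raises FileExistsError if the path is a regular file.
    directory.mkdir(parents=True, exist_ok=True)
    return directory.resolve()


# Demonstration in a throwaway location; replace with your real choirdb path.
with tempfile.TemporaryDirectory() as tmp:
    choirdb = ensure_directory(Path(tmp) / "choirdb")
    print(choirdb.is_dir())
```

The path passed to the helper should be the same one that the choirdb variable in your configuration file points to.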

The choirdb must be created locally. Building it is a lengthy process whose total duration depends on the processing capabilities of your machine. During the build, the entire PDB is downloaded, analysed and sorted into the expected directories.

Initial creation of the local database can be done with:

ProtCHOIR -v -u --conf conf_file

Subsequent updates will not re-download and re-analyse the whole PDB, but only the new (or updated) entries.

Usage

After the initial database set-up, you may run the program normally via command line, by invoking the ProtCHOIR executable and providing an input file either in PDB or FASTA format.

ProtCHOIR -v -f protomer.pdb --conf conf_file

To generate a full HTML report with detailed model analysis as output, run the program with:

ProtCHOIR -v -f protomer.pdb --generate-report --conf conf_file
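When the pipeline has to be launched from a script, the invocations above can be assembled programmatically. The sketch below only builds the argument list from the options shown in this section (-v, -f, --generate-report, --conf); the function name protchoir_command is hypothetical and nothing is executed.

```python
import shlex


def protchoir_command(input_file, conf_file, report=False):
    """Build the ProtCHOIR argument list for a PDB or FASTA input file."""
    cmd = ["ProtCHOIR", "-v", "-f", str(input_file)]
    if report:
        # Request the full HTML report with detailed model analysis.
        cmd.append("--generate-report")
    cmd += ["--conf", str(conf_file)]
    return cmd


# Show the equivalent shell command without running it.
print(shlex.join(protchoir_command("protomer.pdb", "conf_file", report=True)))
```

The returned list can be passed to subprocess.run(cmd, check=True) once ProtCHOIR is installed and the database has been set up.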

To list all available runtime options, run:

ProtCHOIR -h

Methodology Flowchart

The image below summarizes the approach used by ProtCHOIR to build the homo-oligomeric proteins.

https://raw.githubusercontent.com/monteirotorres/ProtCHOIR/master/ProtCHOIR/Contents/ProtCHOIRScheme.svg?sanitize=true

Authors

Pedro Torres, Ph.D; Sony Malhotra, Ph.D; Tom Blundell, FRS, FMedSci.

Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA

License

This project is licensed under the Creative Commons Attribution 4.0 license (CC-BY-4.0), which is provided along with the package; see LICENSE.