EmulsiPred

A package to predict emulsifying potential of peptides


License
MIT
Install
pip install EmulsiPred==0.0.1.1

Documentation

EmulsiPred

Tool for predicting emulsifying peptides. EmulsiPred predicts the emulsifying potential of either a single peptide or every peptide within a set of protein sequences. Three emulsifying scores are calculated for each peptide as described by García-Moreno P.J. et al., 2020, with a peptide defined as a sequence of 7-30 amino acids.

EmulsiPred takes as input a fasta file or a NetSurfP (2 or 3) result file. The NetSurfP-2 file should be in the NetSurfP-1 format (retrieved by clicking 'Export All' in the upper right corner of NetSurfP's 'Server Output' window). For a fasta file with protein sequences, EmulsiPred returns scores for each peptide found within the protein sequences. If given a NetSurfP result file, EmulsiPred returns only the alpha and beta scores, and only for peptides that NetSurfP predicts to lie in an alpha helix or a beta sheet.
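To make the peptide definition concrete, the following is a minimal sketch of how every candidate peptide of 7-30 amino acids could be enumerated from one protein sequence. It assumes that a peptide is any contiguous window of the sequence; the function name `enumerate_peptides` and the toy sequence are hypothetical and are not part of EmulsiPred's API or its actual implementation.

```python
def enumerate_peptides(sequence, min_len=7, max_len=30):
    """Yield (start, peptide) for every contiguous window of min_len..max_len residues.

    Illustrative only: this is an assumption about what "each peptide found
    within the protein sequences" means, not EmulsiPred's own code.
    """
    n = len(sequence)
    for start in range(n):
        for length in range(min_len, max_len + 1):
            end = start + length
            if end > n:
                break  # longer windows from this start would run past the end
            yield start, sequence[start:end]


# Toy 9-residue sequence (made up for illustration).
toy_sequence = "ACDEFGHIK"
for start, peptide in enumerate_peptides(toy_sequence):
    print(start, peptide)
```

A sequence shorter than 7 residues yields no peptides under this definition, and the number of windows grows quickly with sequence length, which is one reason score and occurrence thresholds are useful for trimming the output.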

Prerequisites and installation

The package can be installed either with pip or from GitHub. In both cases, Python 3.9 or higher must be installed in your environment. It is also recommended to install the package in a new environment.

The following commands are run in the command line.

1: Set up a new environment.

    python3 -m venv EmulsiPred_env

2: Enter (activate) the environment.

    source EmulsiPred_env/bin/activate

3a: Install EmulsiPred within the activated environment with pip.

    pip install EmulsiPred

3b: Alternatively, install EmulsiPred from GitHub with pip.

    pip install "git+https://github.com/MarcatiliLab/EmulsiPred.git"

After running either 3a or 3b, EmulsiPred is installed within the activated environment (in our case EmulsiPred_env).


Running EmulsiPred

After installation, EmulsiPred can be run from the terminal or within a Python script.

As mentioned above, EmulsiPred requires either a fasta file containing the protein sequences to check for emulsifiers or a NetSurfP file containing secondary structure information for each sequence.

There are also five optional parameters.

  1. -n (netsurfp_results): Whether the input is a NetSurfP file (default is False).
  2. -p (peptides): Whether the input consists of peptides and therefore shouldn't be cleaved into peptides (default is False).
  3. -o (out_dir): Output directory (default is the current directory).
  4. --nr_seq (nr_seq): Results will only include peptides present in at least this number of sequences (default 1).
  5. --ls (lower_score): Results will only include peptides with a score higher than this value (default 2).
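The effect of the nr_seq and lower_score thresholds can be sketched as a simple filter. This is only an illustration under stated assumptions: the record layout (dictionaries with `peptide`, `score` and `n_sequences` keys), the function name `filter_peptides`, and the example values are all hypothetical and do not reflect EmulsiPred's internal data structures or output format.

```python
def filter_peptides(records, nr_seq=1, lower_score=2.0):
    """Keep records found in at least nr_seq sequences with a score above lower_score.

    The record layout is an assumption made for this sketch.
    """
    return [
        rec
        for rec in records
        if rec["n_sequences"] >= nr_seq and rec["score"] > lower_score
    ]


# Made-up records for illustration; these are not real predictions.
toy_records = [
    {"peptide": "ACDEFGH", "score": 2.5, "n_sequences": 3},
    {"peptide": "CDEFGHI", "score": 1.8, "n_sequences": 3},
    {"peptide": "DEFGHIK", "score": 3.1, "n_sequences": 1},
]

for rec in filter_peptides(toy_records, nr_seq=2, lower_score=2.0):
    print(rec["peptide"], rec["score"])
```

With the defaults (nr_seq=1, lower_score=2), only the score threshold removes anything, since every reported peptide occurs in at least one sequence.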

EmulsiPred can be run directly in the terminal with the following command.

    python -m EmulsiPred -s path/to/sequence.fsa -n False -p False -o path/to/out_dir --nr_seq 1 --ls 2

Alternatively, it can be imported and run in a Python script.

    import EmulsiPred as ep

    ep.EmulsiPred(sequences='path/to/sequence.fsa', netsurfp_results=False, peptides=False, out_dir='path/to/out_dir', nr_seq=1, lower_score=2)

Interpretation of predictions

The predicted scores give a relative ranking of the peptides by their likelihood of being an emulsifier, not an absolute probability. In other words, a higher score implies a higher likelihood of being an emulsifier.
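Because the scores are relative, a natural way to use them is to sort peptides from highest to lowest score and inspect the top candidates first. The sketch below assumes the scores for one score type have already been loaded into a dictionary mapping peptide to score; the function name `rank_peptides` and the values are hypothetical, and reading EmulsiPred's output files is not shown.

```python
def rank_peptides(scores):
    """Return (peptide, score) pairs sorted from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up scores for illustration; these are not real predictions.
toy_scores = {"ACDEFGH": 2.4, "CDEFGHI": 3.7, "DEFGHIK": 2.9}

for rank, (peptide, score) in enumerate(rank_peptides(toy_scores), start=1):
    print(rank, peptide, score)
```

Only the ordering is meaningful here: the gap between two scores should not be read as a measured difference in emulsifying activity.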