Immunological peptide datasets and amino acid properties

pip install pepdata==1.0.7




Formerly a repository for diverse peptide datasets, pepdata now contains only the Immune Epitope Database (IEDB) and a variety of amino acid property matrices. The package will probably be split eventually, with the IEDB portions placed into something named pyiedb.

Amino Acid Properties

The amino_acid module contains a variety of physical and chemical properties, both for single amino acid residues and for interactions between pairs of residues.

Single-residue feature tables are parsed into StringTransformer objects. These can be used like dictionaries, or they can vectorize a whole string through their transform_string method.

Examples of single residue features:

  • hydropathy
  • volume
  • polarity
  • pK_side_chain
  • prct_exposed_residues
  • hydrophilicity
  • accessible_surface_area
  • refractivity
  • local_flexibility
  • accessible_surface_area_folded
  • alpha_helix_score (Chou-Fasman)
  • beta_sheet_score (Chou-Fasman)
  • turn_score (Chou-Fasman)
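To make the dictionary-plus-vectorizer behavior concrete, here is a minimal sketch of how such a table might behave. It does not import pepdata: ResidueTable and hydropathy_sketch are hypothetical names invented for this illustration, and the real StringTransformer may differ in its return type and constructor. The values are the Kyte-Doolittle hydropathy scale, used only as sample data and not necessarily the numbers the package ships.

```python
# Hypothetical stand-in for the behavior described above.
# This is NOT pepdata's actual StringTransformer implementation.
class ResidueTable(dict):
    """Per-residue values that can also vectorize a peptide string."""

    def transform_string(self, peptide):
        # Look up each residue letter in sequence order.
        return [self[residue] for residue in peptide]


# Kyte-Doolittle hydropathy scale, used here only as illustrative data.
hydropathy_sketch = ResidueTable({
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
})

# Dictionary-style access for a single residue.
print(hydropathy_sketch["L"])

# Vectorize a whole peptide, one value per residue.
print(hydropathy_sketch.transform_string("SIINFEKL"))
```

Subclassing dict keeps single-residue lookups and whole-string vectorization on the same object, which matches the dual use described above.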

Pairwise interaction tables are parsed into nested dictionaries, so that the interaction between amino acids x and y can be determined from d[x][y].

Pairwise interaction dictionaries:

  • strand_vs_coil (and its transpose coil_vs_strand)
  • helix_vs_strand (and its transpose strand_vs_helix)
  • helix_vs_coil (and its transpose coil_vs_helix)
  • blosum30
  • blosum50
  • blosum62
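The nested-dictionary layout and the idea of a transposed table can be sketched as follows. This is an illustration only: the transpose helper and the toy_ names are hypothetical, the scores are placeholders rather than values from any real table, and pepdata may build its transposed dictionaries differently.

```python
def transpose(table):
    """Return a new nested dict t such that t[y][x] == table[x][y]."""
    flipped = {}
    for x, row in table.items():
        for y, value in row.items():
            flipped.setdefault(y, {})[x] = value
    return flipped


# Placeholder scores for two residues; NOT real helix/strand statistics.
toy_helix_vs_strand = {
    "A": {"A": 0.1, "V": -0.4},
    "V": {"A": 0.3, "V": 0.2},
}

# Hypothetical transposed view, analogous to strand_vs_helix.
toy_strand_vs_helix = transpose(toy_helix_vs_strand)

# The interaction between x and y is read as d[x][y].
print(toy_helix_vs_strand["A"]["V"])

# The transposed table gives the same value with the indices swapped.
print(toy_strand_vs_helix["V"]["A"])
```

A transpose only matters for asymmetric tables. A symmetric substitution matrix such as BLOSUM62 gives the same value for d[x][y] and d[y][x].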

There is also a function to parse the coefficients of the PMBEC similarity matrix, though this currently lives in the separate pmbec module.