A Python package to work with the HPO Ontology


Keywords
hpo, phenotype, genotype, bioinformatics, rare, diseases, hpo-similarity, ontology
License
MIT
Install
pip install pyhpo==2.7.1

PyHPO

A Python library to work with, analyze, filter and inspect the Human Phenotype Ontology

Visit the PyHPO Documentation for a more detailed overview of all the functionality.

Main features

  • 👫 Identify patient cohorts based on clinical features
  • 👨‍👧‍👦 Cluster patients or other clinical information for GWAS
  • 🩻→🧬 Phenotype to Genotype studies
  • 🍎🍊 HPO similarity analysis
  • 🕸️ Graph based analysis of phenotypes, genes and diseases

PyHPO allows working on individual terms (HPOTerm), sets of terms (HPOSet) and the full Ontology.

The library is helpful for the discovery of novel gene-disease associations and for GWAS data analysis studies. It can also be used to organize clinical information of patients in research or diagnostic settings.

Internally, the ontology is represented as a branched linked list: every term contains pointers to its parent and child terms. This allows fast tree traversal.
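The idea of terms holding pointers to their parents and children can be illustrated with a small, self-contained sketch. The `Term` class and its method names below are hypothetical and are not PyHPO's actual implementation; they only show why following parent pointers makes ancestor lookups cheap.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False keeps identity-based hashing, so terms can live in sets
class Term:
    """Hypothetical stand-in for an ontology term (not PyHPO's HPOTerm)."""

    name: str
    parents: list[Term] = field(default_factory=list)
    children: list[Term] = field(default_factory=list)

    def add_parent(self, parent: Term) -> None:
        # Link both directions so traversal works upwards and downwards
        self.parents.append(parent)
        parent.children.append(self)

    def all_parents(self) -> set[Term]:
        # Walk the parent pointers iteratively and collect every ancestor once
        seen: set[Term] = set()
        stack = list(self.parents)
        while stack:
            term = stack.pop()
            if term not in seen:
                seen.add(term)
                stack.extend(term.parents)
        return seen


# A tiny branched structure: C has two parents, A and B, which share one root
root, a, b, c = Term("root"), Term("A"), Term("B"), Term("C")
a.add_parent(root)
b.add_parent(root)
c.add_parent(a)
c.add_parent(b)

ancestor_names = sorted(t.name for t in c.all_parents())
print(ancestor_names)
```

Because each term stores direct references to its neighbours, no lookup table is needed while traversing; the cost of finding all ancestors is proportional to the number of links visited.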

PyHPO provides an interface to create a pandas DataFrame from its data, allowing integration into existing data analysis tools.

Hint

Check out hpo3 (Documentation) for an alternative implementation. hpo3 has the exact same functionality, but is much faster 🚀 and supports multithreading for even faster large data processing.

Getting started

The easiest way to install PyHPO is via pip:

pip install pyhpo

This will install a base version of PyHPO that offers most functionality.

Note

Some features of PyHPO require pandas and scipy. The standard installation via pip does not include pandas or scipy, and PyHPO will work just fine without them (although you will get a warning on the initial import).

Without pandas, you won't be able to export the Ontology as a DataFrame; everything else will work fine.

Without installing scipy, you won't be able to use the stats module, especially the enrichment calculations.

If you want to do enrichment analysis, you must also install scipy.

If you want to work with PyHPO using pandas dataframes, you can install the pandas dependency

Or simply install both together:

Usage example

Basic use cases

Some examples for basic functionality of PyHPO

How similar are the phenotypes of two patients

How close are two HPO terms

HPOTerm

An HPOTerm contains various metadata about the term, as well as pointers to its parent and child terms. You can access its information content, calculate similarity scores to other terms, find the shortest or longest connection between two terms, list all associated genes or diseases, etc.

Examples:

Basic functionalities of an HPO-Term

(This script is complete, it should run "as is")

Some additional functionality, working with more than one term

(This script is complete, it should run "as is")

Ontology

The Ontology contains all HPO terms, their connections to each other and their associations to genes and diseases. It also provides helper functions for searching HPOTerms.

Examples

(This script is complete, it should run "as is")

The Ontology is a singleton and should be instantiated only once. It can then be reused across several modules, e.g.:

main.py

module2.py

HPOSet

An HPOSet is a collection of HPOTerm objects and can be used to represent e.g. a patient's clinical information. It provides APIs for filtering, for comparison with other HPOSets and for term/gene/disease enrichment.

Examples:

(This script is complete, it should run "as is")

Get genes enriched in an HPOSet

Examples:

(This script is complete, it should run "as is")

For a more detailed description of how to use PyHPO, visit the PyHPO Documentation.

Contributing

Yes, please do so. We appreciate any help, suggestions for improvement or other feedback. Just create a pull-request or open an issue.

License

PyHPO is released under the MIT license.

PyHPO is using the Human Phenotype Ontology. Find out more at http://www.human-phenotype-ontology.org

Sebastian Köhler, Leigh Carmody, Nicole Vasilevsky, Julius O B Jacobsen, et al. Expansion of the Human Phenotype Ontology (HPO) knowledge base and resources. Nucleic Acids Research. (2018) doi: 10.1093/nar/gky1105