Package to get VEnCodes as in Macedo and Gontijo, 2019


Keywords
fantom5, genetics, intersectional
License
BSD-3-Clause
Install
pip install VEnCode==1.2.0

Documentation

Module for VEnCode-related projects based on FANTOM5 databases



This module contains classes and functions that perform intersectional genetics-related operations to find VEnCodes using any matrix of cell types (columns) vs regulatory elements or markers (rows).

Moreover, it contains particular methods to make use of the databases provided by the FANTOM5 consortium, namely the CAGE enhancer and transcription start site (TSS) databases.

For more information on the VEnCode technology, please refer to Macedo and Gontijo, GigaScience, 2020.
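To make the matrix layout concrete, here is a minimal brute-force sketch in plain Python. It is not the package's API or its search algorithm. It assumes that a VEnCode is a set of k regulatory elements that are all active in the target cell type while no other cell type has all of them active at once. The toy data, the threshold and the function name find_combinations are hypothetical.

```python
from itertools import combinations

# Toy matrix: rows are regulatory elements, columns are cell types.
# The expression values are made up; 0.0 means the element is inactive.
expression = {
    "enhancer_A": {"Hepatocyte": 5.0, "Neuron": 0.0, "Fibroblast": 2.0},
    "enhancer_B": {"Hepatocyte": 3.0, "Neuron": 1.5, "Fibroblast": 0.0},
    "promoter_C": {"Hepatocyte": 0.0, "Neuron": 4.0, "Fibroblast": 1.0},
}


def find_combinations(matrix, target, k, threshold=0.0):
    """Return every k-element combination that is jointly active only in `target`.

    Hypothetical helper: an exhaustive search, not the package's algorithms.
    """
    cell_types = list(next(iter(matrix.values())))
    others = [c for c in cell_types if c != target]
    # Only elements active in the target cell type can take part.
    candidates = [e for e, row in matrix.items() if row[target] > threshold]
    found = []
    for combo in combinations(candidates, k):
        # The combination "leaks" if some other cell type has all its elements active.
        leaks = any(
            all(matrix[e][c] > threshold for e in combo) for c in others
        )
        if not leaks:
            found.append(combo)
    return found


print(find_combinations(expression, "Hepatocyte", k=2))
```

In this toy matrix no single element is specific to Hepatocyte, but the pair (enhancer_A, enhancer_B) is, which is the intersectional idea in miniature.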

Getting started

These instructions are designed to:

  • Get you a copy of the project up and running on your local machine for development and testing purposes;
  • Install the VEnCode package in your Python environment for use in your projects.

Prerequisites

To use this module effectively, you will need Python 3 with a few external libraries installed on your machine. Check the requirements file for the list. If you install the package with pip, it should resolve the library requirements for you.

Optionally, if you want to retrieve VEnCodes using the comprehensive FANTOM5 CAGE-seq data, you will have to download the unannotated TSS files from the FANTOM5 consortium website. More specifically, for human, download this file for promoter analysis, and this one and the ID-sample name map for enhancers. Finally, download the curated sample category file.

These four files are enough to find CAGE-based VEnCodes for human.

Installing

  1. Make sure you have the prerequisites;

If you want to edit the project:

  1. Fork this project.

You are now ready to go. Optionally, if you are using the FANTOM5 data instead of your own:

  1. Put the missing FANTOM5 prerequisite files (only the large TSS files are missing) in the directory called "Files".

If you are a user:

  1. Install VEnCode with pip:
pip install VEnCode

You are good to go. Optionally, if you are using the FANTOM5 data instead of your own:

  1. Put all the FANTOM5 prerequisite files in a directory of your choice and, when creating DataTpmFantom5 objects, remember to pass the path to that directory in the files_path argument:
files_path = "just put here the path to your file"
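A small sketch of how the directory path could be checked before it is handed over as files_path. The helper name resolve_files_path and the example location are hypothetical and are not part of VEnCode; the helper only uses the standard library and fails early if the directory is missing or empty.

```python
from pathlib import Path


def resolve_files_path(directory):
    """Return an absolute path string for `directory`, failing early if unusable.

    Hypothetical helper, not part of VEnCode.
    """
    path = Path(directory).expanduser().resolve()
    if not path.is_dir():
        raise NotADirectoryError(f"{path} is not an existing directory")
    if not any(p.is_file() for p in path.iterdir()):
        raise FileNotFoundError(f"{path} contains no files")
    return str(path)


# Example (assumed location of the downloaded FANTOM5 files):
# files_path = resolve_files_path("~/fantom5_files")
# The resulting string is what you would pass as files_path
# when creating DataTpmFantom5 objects.
```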

Using the module

There are several ways to use this module:

  1. To develop your own projects, import objects directly from VEnCode using, for example:
import VEnCode
object1 = VEnCode.DataTpm(...)
vencodes = VEnCode.Vencodes(object1, ...)
vencodes.next(amount=2)
vencodes.export("vencodes", ...)
  • You can see examples of some functions and objects being used at the VEnCode Capsule hosted in CodeOcean.
  2. To run the most relevant scripts, use the utility file process.py, which gives easy access to many scripts, for example:
python process.py get_vencodes Hepatocyte --algorithm heuristic
  3. Run any script by going to the "Scripts" folder inside the package and calling the script individually.

Running the Tests

Tests for this module can be run in several ways; some examples:

  1. In the command-line:

1.1. Using the process.py utility file to run all the tests in one go. This is easily done by running the following command inside the VEnCode module:

python process.py run_tests

1.2. Use Python's standard "unittest" module in the tests directory to run each test individually. Basic example in the command line:

python -m unittest test_internals

1.3. Another way to run each test individually is to install the nose Python package and run nosetests in the tests directory. Basic example in the command line:

nosetests test_internals.py
  2. By importing the VEnCode module in Python:
from VEnCode import tests
tests.run_all_tests()

Documentation

  • The documentation on the main methods of this tool can be found in the official documentation.
  • To see some of the functions in action, refer to the VEnCode Capsule hosted at CodeOcean.
  • For more examples of how to use this module, we suggest going through the scripts folder inside this project's Python package. There, we take the VEnCode tool functions and methods and apply them, as seen in Macedo and Gontijo, GigaScience, 2020.
  • Finally, all the public methods are thoroughly documented in their own docstrings.

Contributing

Please read CONTRIBUTING.rst for details on our code of conduct, and the process for submitting pull requests to us.

Versioning

We use SemVer for versioning. For the versions available, see:

Authors

See also the list of contributors who participated in this project.

License

Refer to the file LICENSE.

Acknowledgements

  • Integrative Biomedicine Laboratory @ CEDOC, NMS, Lisbon (supported by FCT: UID/Multi/04462/2019; PTDC/MED-NEU/30753/2017; and PTDC/BIA-BID/31071/2017 and FAPESP: 2016/09659-3)
  • CEDOC: Chronic Diseases Research Center, Nova Medical School, Lisbon
  • The MIT Portugal Program (MITEXPL/BIO/0097/2017)
  • LIGA PORTUGUESA CONTRA O CANCRO (LPCC) 2017.
  • FCT (IF/00022/2012, SFRH/BD/94931/2013, PTDC/BEXBCM/1370/2014)
  • Prof. Dr. Ney Lemke and Ms. Benilde Pondeca for important discussions.