R package to quantify, analyse and visualise alternative splicing data from The Cancer Genome Atlas (TCGA).
The following tutorial is available:
Other tutorials coming soon:
- Command-line interface
- Developers and other contributors
Install and start running
This package is not available in Bioconductor yet. To install and start using this program, follow these steps:
- Install R
- Open RStudio (or open a console, type R and press enter)
- Type the following to install, load and start the visual interface:
    install.packages("devtools")
    devtools::install_github("nuno-agostinho/psichomics")
    library(psichomics)
    psichomics()
Downloading TCGA data
You can download data from The Cancer Genome Atlas (TCGA) using this package. Choose the cohort of interest, the sample date, the data types of interest and so on. Wait for the downloads to finish and then click Load Data again to process and load the data.
Loading user files
To load your own files, choose the folder where the data is located. PSIchomics will search the given folder and its sub-folders for files that can be loaded.
Exon/intron inclusion levels
To quantify alternative splicing based on the proportion of isoforms that include an exon, the Percent Spliced-In (PSI or Ψ) metric is used.
An estimate of this value is obtained from the proportion of reads supporting the inclusion of an exon over the reads supporting both the inclusion and the exclusion of that exon. Computing this estimate requires both an alternative splicing annotation and junction quantification. A human (hg19 assembly) alternative splicing annotation is already provided, and junction quantification may be retrieved from TCGA.
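The estimate described above can be sketched in a few lines of base R. This is a minimal illustration and not the package's own implementation: the helper `estimate_psi`, its `min_reads` threshold and the read counts are all hypothetical.

```r
# Hypothetical helper (not part of psichomics): estimate PSI from read counts.
# inclusion: reads supporting inclusion of the exon
# exclusion: reads supporting exclusion of the exon
# min_reads: assumed coverage threshold below which no estimate is reported
estimate_psi <- function(inclusion, exclusion, min_reads = 10) {
  total <- inclusion + exclusion
  psi <- inclusion / total
  # Too few supporting reads make the estimate unreliable; mark it as missing
  psi[total < min_reads] <- NA
  psi
}

# Made-up read counts for three splicing events (illustrative only)
inclusion <- c(eventA = 30, eventB = 5, eventC = 2)
exclusion <- c(eventA = 10, eventB = 15, eventC = 3)

psi <- estimate_psi(inclusion, exclusion)
print(psi)
```

Here `eventC` has only five supporting reads in total, so it is reported as missing rather than as an unreliable estimate.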
The program performs survival and principal component analyses, as well as differential splicing analysis using non-parametric statistical tests.
Differential splicing analysis
Analyse alternative splicing quantification based on variance and median statistical tests. The groups available for differential analysis comprise sample types (e.g. normal versus tumour) and clinical attributes of patients (e.g. tumour stage).
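As a rough idea of what such an analysis involves, the sketch below applies non-parametric tests from base R to a single splicing event. The PSI values are simulated, and the choice of tests (Wilcoxon rank-sum, Kruskal-Wallis and Fligner-Killeen) is an assumption made for illustration; it may differ from the tests the program actually runs.

```r
set.seed(42)

# Simulated PSI values for one splicing event (illustrative only, not TCGA data)
psi <- c(rbeta(30, 2, 8), rbeta(30, 8, 2))
group <- factor(rep(c("normal", "tumour"), each = 30))

# Difference in location between two groups (Wilcoxon rank-sum test)
wilcoxon <- wilcox.test(psi ~ group)

# Difference in location across two or more groups (Kruskal-Wallis test)
kruskal <- kruskal.test(psi ~ group)

# Difference in variance between groups (Fligner-Killeen test)
fligner <- fligner.test(psi ~ group)

results <- c(wilcoxon = wilcoxon$p.value,
             kruskal  = kruskal$p.value,
             fligner  = fligner$p.value)
print(results)
```

When many splicing events are tested at once, the resulting p-values would normally be adjusted for multiple testing, for example with `p.adjust`.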
Gene, transcript and protein information
For a given splicing event, examine its gene's annotation and corresponding transcripts and proteins. Related research articles are also available.
Principal component analysis (PCA)
Explore alternative splicing quantification groups using associated clinical attributes.
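Outside the visual interface, a comparable exploration could be sketched with `prcomp` from base R. The PSI matrix below is simulated, with samples in rows and splicing events in columns; this layout and the sample labels are assumptions for the example, not the package's internal format.

```r
set.seed(1)
n_events <- 20

# Simulated PSI matrix (illustrative only): rows are samples, columns are events
normal <- matrix(rbeta(10 * n_events, 2, 5), nrow = 10)
tumour <- matrix(rbeta(10 * n_events, 5, 2), nrow = 10)
psi <- rbind(normal, tumour)
rownames(psi) <- paste0(rep(c("normal", "tumour"), each = 10), "_", 1:10)
colnames(psi) <- paste0("event", seq_len(n_events))

# Clinical attribute used to interpret the components
sample_type <- factor(rep(c("normal", "tumour"), each = 10))

# Centre and scale each event before computing the principal components
pca <- prcomp(psi, center = TRUE, scale. = TRUE)

# Proportion of variance explained by each component
explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(explained[1:3], 3))

# Average PC1 score per sample type shows whether PC1 separates the groups
pc1_means <- tapply(pca$x[, "PC1"], sample_type, mean)
print(pc1_means)
```

Real PSI tables usually contain missing values, which `prcomp` does not accept, so events or samples with missing data would need to be filtered or imputed first.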
Survival analysis
Analyse survival based on clinical attributes (e.g. tumour stage, gender and race). Additionally, study the impact of the quantification of a single alternative splicing event on patient survival.
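The effect of a single splicing event on survival could be examined along the following lines with the `survival` package. This is a sketch under stated assumptions: the follow-up data are simulated, and splitting patients at the median PSI is only one possible cut-off, chosen here for illustration.

```r
library(survival)
set.seed(7)

# Simulated data (illustrative only): PSI of one event and patient follow-up
n <- 100
psi <- runif(n)

# Assumed cut-off: split patients at the median PSI value
group <- factor(ifelse(psi >= median(psi), "high PSI", "low PSI"))

# Follow-up time in arbitrary units; status 1 = event observed, 0 = censored
follow_up <- rexp(n, rate = ifelse(group == "high PSI", 0.02, 0.01))
status <- rbinom(n, 1, 0.7)

# Kaplan-Meier curves per group
fit <- survfit(Surv(follow_up, status) ~ group)
print(fit)

# Log-rank test comparing the survival curves of the two groups
log_rank <- survdiff(Surv(follow_up, status) ~ group)
p_value <- 1 - pchisq(log_rank$chisq, df = length(log_rank$n) - 1)
print(p_value)
```

The same formula interface accepts a clinical attribute such as tumour stage in place of the PSI-based group.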
Data grouping
- By column: automatically create groups by selecting a specific column of the dataset; for instance, to create a group for each tumour stage, start typing tumor_stage, select the appropriate field from the suggestions, click Create group and confirm that there is now one group for each stage.
- By row: input specific rows to create a group
- By subset expression: type a subset expression
- By GREP expression: apply a GREP expression over a specific column of the dataset
You can also click on groups to select them and then merge, intersect or remove them.
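The four ways of creating groups, and the merge and intersect operations, roughly correspond to the base R operations sketched below. The clinical table, its column names and the patient identifiers are made up for the example; the visual interface performs these steps for you and its internals may differ.

```r
# Made-up clinical table (illustrative only)
clinical <- data.frame(
  patient = paste0("P", 1:6),
  tumor_stage = c("stage i", "stage ii", "stage ii",
                  "stage iii", "stage i", "stage iv"),
  age = c(45, 62, 58, 70, 39, 66),
  stringsAsFactors = FALSE
)

# By column: one group per distinct value of a column
by_column <- split(clinical$patient, clinical$tumor_stage)

# By row: a group made of specific rows
by_row <- clinical$patient[c(1, 3, 5)]

# By subset expression: rows satisfying a logical condition
by_subset <- subset(clinical, age > 60)$patient

# By GREP expression: rows whose column value matches a regular expression
by_grep <- clinical$patient[grepl("^stage i{1,2}$", clinical$tumor_stage)]

# Merging and intersecting existing groups
merged <- union(by_row, by_subset)
intersected <- intersect(by_subset, by_grep)

print(by_column)
print(merged)
print(intersected)
```

In this example the regular expression keeps only stages i and ii, so intersecting it with the patients older than 60 leaves a single patient.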
All feedback on the program, documentation and associated material is welcome. Please send any suggestions and comments to the following contact:
Special thanks to my lab colleagues for their work-related support and supporting chatter.