umiAnalyzer

Tools for Analyzing Sequencing Data with Unique Molecular Identifiers


Keywords
targeted-sequencing, unique-molecular-identifiers, variant-analysis
License
GPL-3.0

Documentation

umiAnalyzer 1.0.0 (23-11-2021)

Tools for analyzing sequencing data containing unique molecular identifiers generated by UMIErrorCorrect (https://github.com/stahlberggroup/umierrorcorrect). The package merges multiple samples into a single UMIexperiment object, which can be manipulated using built-in functions to generate tabular and graphical output. It also includes a Shiny app with a graphical user interface for exploring data and generating plots and report documents.

This README serves as a basic introduction. For more detailed information, see the R vignette:

browseVignettes('umiAnalyzer')

Requirements

  • R (>= 4.1.0), which can be downloaded from the Comprehensive R Archive Network (CRAN).
  • Installation from R using install_github requires the devtools package.
  • Running the Shiny app requires additional packages (see below).
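
These requirements can be checked from an R session before installing anything. The following is a minimal sketch using only base R; the helper name missing_packages is hypothetical and not part of umiAnalyzer.

```r
# Hypothetical helper (not part of umiAnalyzer): return the packages in
# `pkgs` that are not currently installed.
missing_packages <- function(pkgs) {
  is_installed <- vapply(
    pkgs,
    function(p) nzchar(system.file(package = p)),
    logical(1)
  )
  unname(pkgs[!is_installed])
}

# R version check: R >= 4.1.0 is listed as a requirement.
r_ok <- getRversion() >= '4.1.0'
if (!r_ok) {
  message('R ', as.character(getRversion()), ' is older than 4.1.0; please update.')
}

# devtools is only needed for install_github().
todo <- missing_packages('devtools')
if (length(todo) > 0) {
  message('Install first: ', paste(todo, collapse = ', '))
}
```

The same helper can be pointed at the packages required by the Shiny app to see which ones still need to be installed.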

Installation

Install the current stable version or the latest development version from GitHub. Installation from CRAN is not supported yet.

# from CRAN (not supported yet)
#install.packages('umiAnalyzer')

# Current stable version from github using the devtools package:
devtools::install_github('sfilges/umiAnalyzer')

# Latest development version from github:
devtools::install_github('sfilges/umiAnalyzer', ref = 'devel')

Running the visualization app

To use the Shiny app the following R packages need to be installed:

pkgs <- c('tidyverse', 'shinydashboard', 'shinyFiles', 'shinyWidgets', 'DT')

install.packages(pkgs)

Run the following command in the R console to start the app:

umiAnalyzer::runUmiVisualizer()

You can upload data from your computer within the app at any point. Alternatively, you can specify a directory containing UMIErrorCorrect output when launching the app by using the path argument.

umiAnalyzer::runUmiVisualizer(path = 'path_to_data')

Using the R package in your own scripts

How to build your own UMIexperiment object

Define a variable containing the path to the directory with all the UMIErrorCorrect output folders belonging to your experiment. umiAnalyzer comes with raw test data generated with UMIErrorCorrect that you can import if you don't have any of your own.
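
Before building the object, it can help to check what the experiment directory actually contains. The sketch below uses only base R and a temporary directory with made-up sample folder names (sample_A, sample_B) purely for illustration; it assumes one sub-folder per sample.

```r
# Illustration only: create a temporary directory that mimics an experiment
# folder with one sub-folder per sample (the sample names are made up).
experiment_dir <- file.path(tempdir(), 'my_experiment')
dir.create(file.path(experiment_dir, 'sample_A'), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(experiment_dir, 'sample_B'), recursive = TRUE, showWarnings = FALSE)

# List the immediate sub-folders; each one is expected to hold the
# UMIErrorCorrect output of a single sample.
sample_dirs <- list.dirs(experiment_dir, full.names = FALSE, recursive = FALSE)
print(sample_dirs)

# To point at the test data shipped with the package instead
# (requires umiAnalyzer to be installed):
# experiment_dir <- system.file('extdata', package = 'umiAnalyzer')
```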

Call createUmiExperiment to create your UMIexperiment object.

The UMIexperiment object always retains your raw data. You can create as many filters as you like, and each one is saved as a separate object that you can access later. Filter the consensus table of a UMIexperiment object with filterUmiObject. The only mandatory arguments are the object to be filtered and a user-defined name. Use that name to retrieve a filtered table with getFilteredData.

library(umiAnalyzer)

main <- system.file('extdata', package = 'umiAnalyzer')

simsen <- createUmiExperiment(main)

reads <- parseBamFiles(main, consDepth = 10)

plotFamilyHistogram(reads)

simsen <- generateQCplots(simsen)

simsen <- filterUmiObject(simsen)

myfilter <- getFilteredData(simsen)
myfilter

simsen <- generateAmpliconPlots(simsen)
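
The calls above do not pass a filter name. A named filter, as described in the text, might look like the following sketch. The argument name `name` and the filter name 'strict' are assumptions; check ?filterUmiObject and ?getFilteredData for the actual signatures. The package calls only run if umiAnalyzer is installed.

```r
# User-defined name for this filter (arbitrary choice).
filter_name <- 'strict'

if (requireNamespace('umiAnalyzer', quietly = TRUE)) {
  # Build the object from the test data shipped with the package.
  main <- system.file('extdata', package = 'umiAnalyzer')
  simsen <- umiAnalyzer::createUmiExperiment(main)

  # Assumed argument: `name` stores the filter under a user-defined label.
  simsen <- umiAnalyzer::filterUmiObject(object = simsen, name = filter_name)

  # Retrieve the filtered consensus table using the same name.
  strict_data <- umiAnalyzer::getFilteredData(object = simsen, name = filter_name)
  print(head(strict_data))
}
```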

Importing experimental designs and statistics

Experimental design

umiAnalyzer supports adding metadata, such as experimental design matrices or clinical parameters, to a UMIexperiment object. This is done with the importDesign function, which requires a simply formatted table supplied by the user as a tab-separated file. The samples in the metadata file must be in the same order as when the UMIexperiment object was built.

metaData <- system.file('extdata', 'metadata.txt', package = 'umiAnalyzer')

simsen <- importDesign(
  object = simsen,
  file = metaData
)

design <- getMetaData(
  object = simsen, 
  attributeName = 'design'
)

design
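
The design file is a plain tab-separated table with one row per sample. The following base R sketch writes such a file and checks that its row order matches the sample order. The column names (sample, replicate), the sample names, and the file location are hypothetical; compare with the metadata.txt shipped in the package for the real layout.

```r
# Hypothetical sample order, as it would appear in the UMIexperiment object.
sample_order <- c('sample_A', 'sample_B', 'sample_C')

# Hypothetical design table: one row per sample, in the same order.
design_table <- data.frame(
  sample = sample_order,
  replicate = c(1, 1, 2),
  stringsAsFactors = FALSE
)

# Write a tab-separated file without row names or quotes.
design_file <- file.path(tempdir(), 'metadata.txt')
write.table(design_table, design_file, sep = '\t', row.names = FALSE, quote = FALSE)

# Read the file back and confirm that the row order still matches.
design_check <- read.delim(design_file, stringsAsFactors = FALSE)
stopifnot(identical(design_check$sample, sample_order))
```

A check like the final stopifnot line is a cheap way to catch a reordered file before passing it to importDesign.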

Merging data

Merge technical replicates for statistics

The mergeTechnicalReplicates function produces a merged data set that is accessible from the UMIexperiment object via object@merged.data. This is meant to provide statistical information across multiple replicates. If you instead want to merge multiple sequencing runs of the same sample into a single sample, use the collapseReplicates function.

simsen <- mergeTechnicalReplicates(
  object = simsen,
  group.by = 'replicate'
)

viewNormPlot(simsen)

Working with metadata

It is also possible to add metadata to an object and to retrieve it when needed. The design matrix loaded with importDesign can be retrieved as follows:

design <- getMetaData(
  object = simsen,
  attributeName = 'design'
)

design

Similarly, any kind of metadata can be added to an object using addMetaData and retrieved using getMetaData:

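# Attach a free-text comment to the object under a chosen attribute name
comment <- 'fix this'
simsen <- addMetaData(
  object = simsen,
  attributeName = 'my-comment',
  attributeValue = comment
)

# Retrieve the comment again using the same attribute name
myattribute <- getMetaData(
  object = simsen,
  attributeName = 'my-comment'
)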

myattribute

Generating VCF output (beta)

Generates a VCF file in the current working directory. A different output directory can be specified using the outDir parameter. The printAll parameter specifies whether all variants should be printed or only those supported by at least 5 reads (default = FALSE).

generateVCF(object = simsen, outFile = 'myVCF')
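
A variation that sets the outDir and printAll parameters is sketched below. It assumes that umiAnalyzer is installed and that a UMIexperiment object named simsen already exists in the session; otherwise the call is skipped. The output directory name is arbitrary.

```r
# Arbitrary output directory for this illustration.
vcf_dir <- file.path(tempdir(), 'vcf_output')
dir.create(vcf_dir, recursive = TRUE, showWarnings = FALSE)

# Only attempt the call if the package and an object named `simsen` exist.
if (requireNamespace('umiAnalyzer', quietly = TRUE) && exists('simsen')) {
  umiAnalyzer::generateVCF(
    object = simsen,
    outDir = vcf_dir,    # write here instead of the working directory
    outFile = 'myVCF',
    printAll = TRUE      # print all variants, not only those with read support
  )
}
```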