Learn R Programming

h5vc (version 2.6.3)

mismatchPlot: mismatchPlot

Description

Plotting function that returns a ggplot2 object representing the mismatches and coverages of the specified samples in the specified region.

Usage

mismatchPlot( data, sampledata, samples=sampledata$Sample, windowsize = NULL, position = NULL, range = NULL, plotReference = TRUE, refHeight=8, tickSpacing = c(10,10) )

Arguments

data
The data to be plotted. Returned by h5dapply or h5readBlock.
sampledata
The sampledata for the cohort represented by data. Returned by getSampleData
samples
A character vector listing the names of samples to be plotted, defaults to all samples as described in sampledata
windowsize
Size of the window in which to plot on each side. The total interval that is plotted will be [position-windowsize,position+windowsize]
position
The position at which the plot shall be centered
range
Integer vector of two elements specifying a range of coordinates to be plotted, use either position + windowsize or range; if both are provided range overwrites position and windowsize.
plotReference
This boolean flag specifies if a reference track should be plotted, only takes effect if there is a slot named Reference in the data object passed to the function
refHeight
Height of the reference track in coverage units (default of 8 = reference track is as high as 8 reads coverage would be in the plot of a sample.)
tickSpacing
Integer vector of two elements, specifying the spacing of ticks along the x and y axes respectively.

Value

  • A ggplot object containing the mismatch plot, this can be used like any other ggplot object, i.e. additional layers and styles my be applied by simply adding them to the plot.

Details

If position and windowsize are specified this function creates a plot centered on position using the coverage and mismatch counts stored in data, annotating it with sample information provided in the data.frame sampledata and showing all samples listed in sample. If range is specified, the plot will cover the positions from range[1] to range[2]. The difference between specifying range or position plus windowsize lies only in the labelling of the x-axis and the coordinate system used on the x-axis. In the former case the coordinate system is that of genomic coordinates as specified in range, when using the latter the x-axis coordinates go from -windowsize through +windowsize and position 0 is marked with the calue provided in the position parameter. Furthermore when a position and windowsize are provided two black lines marking the center position are drawn (this is usefull for visualising SNVs) If neither range, nor position and windowsize are specified the function will try to extract the information from the data object. If data is the return value of a call to h5dapply or h5readBlock this will work automagically. The plot has the genomic position on the x-axis. The y-axis encodes values where positive values are on the forward strand and negative values on the reverse. The coverage is shown in grey, deletions in purple and the mismatches in the colors specified in the legend. Note that for each possible mismatch there is an additional color for low-quality counts (coming from the first and last sequencing cycles), so e.g. C is filled dark red and C_lq light red. If data is the result of a call to h5dapply representing multiple blocks of data as defined in the range parameter to h5dapply then the plot will contain the mismatchPlots of each of the ranges plotted next to each other.

Examples

Run this code
# loading library and example data
  library(h5vc)
  tallyFile <- system.file( "extdata", "example.tally.hfs5", package = "h5vcData" )
  sampleData <- getSampleData( tallyFile, "/ExampleStudy/16" )
  position <- 29979628
  windowsize <- 30
  samples <- sampleData$Sample[sampleData$Patient == "Patient8"]
  data <- h5readBlock(
    filename = tallyFile,
    group = "/ExampleStudy/16",
    names = c("Coverages", "Counts", "Deletions", "Reference"),
    range = c(position - windowsize, position + windowsize)
  )
  #Plotting with position and windowsize
  p <- mismatchPlot(
    data = data,
    sampledata = sampleData,
    samples = samples,
    windowsize = windowsize,
    position = position
  )
  print(p)
  #plotting with range and modified tickSpacing and refHeight
  p <- mismatchPlot(
    data = data,
    sampledata = sampleData,
    samples = samples,
    range = c(position - windowsize, position + windowsize),
    tickSpacing = c(20, 5),
    refHeight = 5
  )
  print(p)
  #plotting without specfiying range or position
  p <- mismatchPlot(
    data = data,
    sampledata = sampleData,
    samples = samples
  )
  print(p)
  #Plotting multiple regions (with small overlaps)
  library(IRanges)
  dataList <- h5dapply(
    filename = tallyFile,
    group = "/ExampleStudy/16",
    names = c("Coverages", "Counts", "Deletions", "Reference"),
    range = IRanges(start = seq( position - windowsize, position + windowsize, 20), width = 30 )
  )
  p <- mismatchPlot(
    data = dataList,
    sampledata = sampleData,
    samples = samples
  )
  print(p)

Run the code above in your browser using DataLab