h5vc (version 2.6.3)

callVariantsFisher: Paired variant calling using Fisher tests

Description

This function implements a simple paired variant calling strategy based on the Fisher test.

Usage

callVariantsPairedFisher(data, sampledata, pValCutOff = 0.05, minCoverage = 5, mergeDels = TRUE, mergeAggregator = mean)

Arguments

data
A list with elements Counts (a 4d integer array of size [1:12, 1:2, 1:k, 1:n]), Coverages (a 3d integer array of size [1:2, 1:k, 1:n]) and Reference (a 1d integer vector of size [1:n]), where k is the number of samples and n the number of positions -- see Details.
sampledata
A data.frame with k rows (one for each sample) and columns Type, Column and (Group or Patient). The tally file should contain this information as a group attribute, see getSampleData for an example.
pValCutOff
Maximum allowed p-value for the Fisher test on the contingency matrix matrix(c(caseCounts, caseCoverage, controlCounts, controlCoverage), nrow=2).
minCoverage
Minimum coverage required in both samples (case and control) for a call to be made
mergeDels
Boolean flag specifying whether adjacent deletions should be merged
mergeAggregator
Which function to use for aggregating the values associated with adjacent deletions that are being merged
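
As a rough illustration of the input shapes described above, the following sketch builds placeholder objects with the documented dimensions. The values of k and n, the names mock_data and mock_sampledata, and the contents of the Type, Column and Patient columns are assumptions made for illustration only; real inputs would normally come from h5dapply and getSampleData. The element name Coverages follows the Details and Examples sections.

```r
# Hypothetical sizes: k = 2 samples, n = 5 genomic positions
k <- 2L
n <- 5L

# Placeholder list with the documented array dimensions (all zeros)
mock_data <- list(
  Counts    = array(0L, dim = c(12L, 2L, k, n)),  # [1:12, 1:2, 1:k, 1:n]
  Coverages = array(0L, dim = c(2L, k, n)),       # [1:2, 1:k, 1:n]
  Reference = integer(n)                          # [1:n]
)

# Placeholder sample table with one row per sample; the column
# values below are assumed, not taken from a real tally file
mock_sampledata <- data.frame(
  Type    = c("Case", "Control"),
  Column  = c(1L, 2L),
  Patient = c("PatientA", "PatientA"),
  stringsAsFactors = FALSE
)

# Inspect the structure of the placeholder list
str(mock_data)
```

These placeholders only demonstrate the expected shapes; they are not meant to be passed to callVariantsPairedFisher as meaningful input.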

Value

The return value is a data.frame with the following columns:

  • Chrom: The chromosome the potential variant is on
  • Start: The starting position of the variant
  • End: The end position of the variant
  • Sample: The Case sample in which the variant was observed
  • refAllele: The reference allele
  • altAllele: The alternate allele
  • caseCount: Support for the variant in the Case sample
  • caseCoverage: Coverage of the variant position in the Case sample
  • controlCount: Support for the variant in the Control sample
  • controlCoverage: Coverage of the variant position in the Control sample
  • pValue: The p-value returned by fisher.test

Details

data is a list that must contain at least the Counts, Coverages and Reference datasets. This list will usually be generated by a call to the h5dapply function, in which the tally file, chromosome, datasets and regions within the datasets are specified. See h5dapply for specifics.

callVariantsPairedFisher implements a simple pairwise variant calling approach based on applying fisher.test to the following contingency matrix:

  caseSupport       caseCoverage - caseSupport
  controlSupport    controlCoverage - controlSupport

The results are filtered by pValCutOff and minCoverage.
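
The test described above can be sketched for a single position using base R's fisher.test. This is a minimal illustration, not the package's internal code: the counts are made up, the default two-sided alternative of fisher.test is assumed, and the exact comparison operators used for pValCutOff and minCoverage are assumptions as well.

```r
# Hypothetical counts for one position (illustrative values only)
caseSupport     <- 12L
caseCoverage    <- 40L
controlSupport  <- 1L
controlCoverage <- 38L

# Contingency matrix laid out row-wise as in the Details section
contingency <- matrix(
  c(caseSupport,    caseCoverage    - caseSupport,
    controlSupport, controlCoverage - controlSupport),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("case", "control"), c("support", "nonSupport"))
)

# Fisher's exact test (two-sided by default; assumed here)
pValue <- fisher.test(contingency)$p.value

# Filtering step with the documented default thresholds;
# the use of <= and >= is an assumption
pValCutOff  <- 0.05
minCoverage <- 5
isCall <- pValue <= pValCutOff &&
  caseCoverage >= minCoverage &&
  controlCoverage >= minCoverage

print(contingency)
print(pValue)
print(isCall)
```

With these made-up counts the variant is supported by 12 of 40 reads in the case sample and 1 of 38 in the control, so the position would pass both filters under the stated assumptions.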

Examples

library(h5vc) # loading library
tallyFile <- system.file( "extdata", "example.tally.hfs5", package = "h5vcData" )
sampleData <- getSampleData( tallyFile, "/ExampleStudy/16" )
position <- 29979629
windowsize <- 2000
vars <- h5dapply( # Calling Variants
  filename = tallyFile,
  group = "/ExampleStudy/16",
  blocksize = 1000,
  FUN = callVariantsPairedFisher,
  sampledata = sampleData,
  pValCutOff = 0.1,
  names = c("Coverages", "Counts", "Reference"),
  range = c(position - windowsize, position + windowsize),
  verbose = TRUE
)
vars <- do.call(rbind, vars)
vars