callVariantsPaired( data, sampledata, cl = vcConfParams() )

vcConfParams(
  minStrandCov = 5,
  maxStrandCov = 200,
  minStrandAltSupport = 2,
  maxStrandAltSupportControl = 0,
  minStrandDelSupport = minStrandAltSupport,
  maxStrandDelSupportControl = maxStrandAltSupportControl,
  minStrandInsSupport = minStrandAltSupport,
  maxStrandInsSupportControl = maxStrandAltSupportControl,
  minStrandCovControl = 5,
  maxStrandCovControl = 200,
  bases = 5:8,
  returnDataPoints = TRUE,
  annotateWithBackground = TRUE,
  mergeCalls = TRUE,
  mergeAggregator = mean,
  pValueAggregator = max
)
data: list with elements Counts (a 4d integer array of size [1:12, 1:2, 1:k, 1:n]), Coverages (a 3d integer array of size [1:2, 1:k, 1:n]), Deletions (a 3d integer array of size [1:2, 1:k, 1:n]) and Reference (a 1d integer vector of size [1:n]); see Details.

sampledata: data.frame with k rows (one for each sample) and columns Type, Column and (SampleGroup or Patient). The tally file should contain this information as a group attribute; see getSampleData for an example.

cl: list of variant calling parameters, as built by vcConfParams.

returnDataPoints: if returnDataPoints == FALSE, only the variant positions are returned.

annotateWithBackground: only useful if returnDataPoints == TRUE.

mergeCalls: merge adjacent calls (see callDeletionsPaired). Only applied if returnDataPoints == TRUE.

mergeAggregator: defaults to mean, which means that a deletion larger than 1bp will be annotated with the means of the counts and coverages.

pValueAggregator: defaults to max. Only applied if annotateWithBackground == TRUE.
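To make the mergeAggregator and pValueAggregator options more concrete, here is a minimal base R sketch of how adjacent single-base deletion calls could be collapsed into one call. It is not the h5vc implementation: the helper mergeAdjacentCalls, the Start and End columns, and the toy calls table are hypothetical. Only the column names caseCountFwd and pValueFwd and the default aggregators mean and max are taken from this page.

```r
# Hypothetical helper (not part of h5vc): merge runs of adjacent positions.
mergeAdjacentCalls <- function(calls, mergeAggregator = mean, pValueAggregator = max) {
  if (nrow(calls) == 0) return(calls)
  calls <- calls[order(calls$Start), ]
  # A new run starts whenever the gap to the previous position is larger than 1
  runId <- cumsum(c(TRUE, diff(calls$Start) > 1))
  merged <- lapply(split(calls, runId), function(run) {
    data.frame(
      Start        = min(run$Start),
      End          = max(run$Start),
      caseCountFwd = mergeAggregator(run$caseCountFwd),  # counts: aggregated with mean
      pValueFwd    = pValueAggregator(run$pValueFwd)     # p-values: aggregated with max
    )
  })
  out <- do.call(rbind, merged)
  rownames(out) <- NULL
  out
}

# Toy input: a 3bp deletion (positions 100 to 102) and a single-base deletion at 200
calls <- data.frame(
  Start        = c(100, 101, 102, 200),
  caseCountFwd = c(4, 6, 5, 3),
  pValueFwd    = c(0.001, 0.010, 0.002, 0.030)
)
merged <- mergeAdjacentCalls(calls)
print(merged)
```

Using max for the p-values is the conservative choice: the merged call is only as significant as its weakest single-base call.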
A data.frame containing the calls themselves, which might contain annotations. Adjacent calls might be merged and calls might be annotated with p-values, depending on the configuration parameters.

When the configuration parameter returnDataPoints is FALSE, the functions return the positions of potential variants as a list containing one integer vector of positions for each sample; if no positions were found for a sample, the list will contain NULL instead. In the case of returnDataPoints == TRUE the functions return either NULL if no positions were found or a data.frame
with the following slots:

Case sample in which the variant was observed
"-")
Case sample on the forward strand
Case sample on the reverse strand
Case sample on the forward strand
Case sample on the reverse strand
Control sample on the forward strand
Control sample on the reverse strand
Control sample on the forward strand
Control sample on the reverse strand

If the annotateWithBackground option is set, the following extra columns are returned:

Control on the forward strand
Control on the reverse strand
p.value of the test binom.test( caseCountFwd, caseCoverageFwd, p = backgroundFrequencyFwd, alternative = "greater")
p.value of the test binom.test( caseCountRev, caseCoverageRev, p = backgroundFrequencyRev, alternative = "greater")
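The p-value columns are ordinary one-sided binomial tests, so they can be reproduced for a single position with base R. The following is a minimal sketch with made-up numbers (6 supporting reads out of 50 on the forward strand, background frequency 0.01); only the shape of the binom.test call is taken from this page.

```r
# Made-up values for one position on the forward strand
caseCountFwd           <- 6     # reads supporting the variant in the Case sample
caseCoverageFwd        <- 50    # total coverage in the Case sample
backgroundFrequencyFwd <- 0.01  # mismatch frequency estimated from the Control samples

# One-sided test: is the Case support higher than the background would explain?
testFwd <- binom.test(caseCountFwd, caseCoverageFwd,
                      p = backgroundFrequencyFwd, alternative = "greater")
pValueFwd <- testFwd$p.value
print(pValueFwd)

# The same upper-tail probability P(X >= caseCountFwd), computed directly
tailProb <- pbinom(caseCountFwd - 1, caseCoverageFwd,
                   backgroundFrequencyFwd, lower.tail = FALSE)
print(tailProb)
```

The reverse strand is tested in the same way with the corresponding Rev values.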
callDeletionsPaired merges adjacent single-base deletion calls if the option mergeCalls is set to TRUE. In that case the counts and coverages (e.g. caseCountFwd) are aggregated using the function supplied in the mergeAggregator option of the configuration list (defaults to mean), and the p-values pValueFwd and pValueRev (if annotateWithBackground is TRUE) are aggregated using the function supplied in the pValueAggregator option (defaults to max).

data is a list of datasets which has to contain at least the Counts and Coverages datasets for variant calling, or the Deletions dataset for deletion calling. This list will usually be generated by a call to the h5dapply function, in which the tally file, chromosome, datasets and regions within the datasets would be specified. See ?h5dapply for specifics. In order for callVariantsPaired to return the correct locations of the variants, the h5dapplyInfo slot must be present in data as well. This is itself a list (added automatically by h5dapply and h5readBlock, respectively) and contains the slots Group (location in the HDF5 file) and Blockstart, which are used to set the chromosome and the genomic positions of variants.

vcConfParams is a helper function that builds a set of variant calling parameters as a list. This list is provided to the calling functions, e.g. callVariantsPaired, and influences their behavior.
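The layout of data can be mocked up without a tally file, which helps when reasoning about indices. The sketch below builds an all-zero block for k = 2 samples and n = 5 positions. The Group and Blockstart values are invented, the meaning attached to each array dimension in the comments is an assumption, and the conversion from a block index to a genomic position (Blockstart + index - 1) is a guess at how Blockstart is used rather than something this page states.

```r
k <- 2L  # number of samples
n <- 5L  # number of genomic positions in the block

data <- list(
  # Dimension meanings in the comments are assumed, not documented here
  Counts    = array(0L, dim = c(12L, 2L, k, n)),  # [base slot, strand, sample, position]
  Coverages = array(0L, dim = c(2L, k, n)),       # [strand, sample, position]
  Deletions = array(0L, dim = c(2L, k, n)),       # [strand, sample, position]
  Reference = integer(n),                         # one entry per position
  h5dapplyInfo = list(
    Group      = "/ExampleStudy/16",  # invented, mirrors the group used in the example
    Blockstart = 29978629L            # invented block start
  )
)

str(data, max.level = 1)

# Assumed mapping from an index inside the block to a genomic position
blockIndex <- 3L
genomicPosition <- data$h5dapplyInfo$Blockstart + blockIndex - 1L
print(genomicPosition)
```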
callVariantsPaired implements a simple pairwise variant calling approach, applying the filters specified in cl. It may additionally compute an estimate of the background mismatch rate (the mean mismatch rate of all samples labeled as 'Control' in the sampledata) and annotate the calls with p-values for the binom.test of the observed mismatch counts and coverage at each of the samples labeled as 'Case'.
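As a rough illustration of the pairwise filtering idea, the sketch below applies a few of the vcConfParams thresholds to per-strand counts at one position. The function passesPairedFilter is hypothetical and much simpler than callVariantsPaired; in particular, requiring both strands to pass every threshold is an assumption suggested by the "Strand" in the parameter names, not a documented rule.

```r
# Hypothetical, simplified filter; threshold names and defaults follow vcConfParams
passesPairedFilter <- function(caseCount, caseCov, controlCount, controlCov,
                               minStrandCov = 5, maxStrandCov = 200,
                               minStrandAltSupport = 2,
                               maxStrandAltSupportControl = 0,
                               minStrandCovControl = 5, maxStrandCovControl = 200) {
  # Each data argument is a length-2 vector: c(forward, reverse)
  all(caseCov >= minStrandCov, caseCov <= maxStrandCov,
      controlCov >= minStrandCovControl, controlCov <= maxStrandCovControl,
      caseCount >= minStrandAltSupport,
      controlCount <= maxStrandAltSupportControl)
}

# Variant supported on both strands in the Case sample and absent in the Control
accepted <- passesPairedFilter(caseCount = c(4, 3), caseCov = c(30, 28),
                               controlCount = c(0, 0), controlCov = c(25, 31))

# Same position, but the Control sample shows one supporting read on the forward strand
rejected <- passesPairedFilter(caseCount = c(4, 3), caseCov = c(30, 28),
                               controlCount = c(1, 0), controlCov = c(25, 31))

print(c(accepted = accepted, rejected = rejected))
```

With the default maxStrandAltSupportControl = 0, a single supporting read in the Control sample is enough to discard the position in this sketch.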
library(h5vc) # loading library
tallyFile <- system.file( "extdata", "example.tally.hfs5", package = "h5vcData" )
sampleData <- getSampleData( tallyFile, "/ExampleStudy/16" )
position <- 29979629
windowsize <- 1000
vars <- h5dapply( # Calling Variants
  filename = tallyFile,
  group = "/ExampleStudy/16",
  blocksize = 500,
  FUN = callVariantsPaired,
  sampledata = sampleData,
  cl = vcConfParams(returnDataPoints=TRUE),
  names = c("Coverages", "Counts", "Reference", "Deletions"),
  range = c(position - windowsize, position + windowsize)
)
vars <- do.call( rbind, vars ) # merge the results from all blocks by row
vars # We did find a variant