Learn R Programming

h5vc (version 2.6.3)

applyTallies: Preparing the results of tallyBAM for writing to an HDF5 tally file

Description

This function tallies a set of bam files and prepares the data for writing to an HDF5 tally file.

Usage

applyTallies( bamfiles, chrom, start, stop, q=25, ncycles = 0, max.depth=1000000, prepForHDF5 = TRUE, reference = NULL)

Arguments

bamfiles
A character vector of filenames of the bam files that should be tallies. Note that for writing to an HDF5 file the order of this vector must match the order of the Column field in the sampledata object that corresponds to the dataset - see setSampleData for details.
prepForHDF5
Boolean flag to specify whether the data shall be structured for compatibility with the HDF5 tally file format. See the details section of this manual page.
reference
A DNAString object containing the reference sequence corresponding to the region that is described in the counts array -- if this is NULL a consensus vote will be used to estimate the reference at any given position, this means you cannot detect variants with AF >= 0.5 anymore
chrom
Chromosome in which to tally
start
First position of the tally
stop
Last position of the tally
q
quality cut-off for considering a base call
ncycles
number of sequencing cycles form the front and back of the read that should be considered unreliable - used for stratifying the nucleotide counts
max.depth
only tally a position if there are less than this many reads overlapping it - can prevent long runtimes in unreliable regions

Value

  • A list with slots containing the Counts,Coverages,Deletions and Reference datasets for the given sample if prepForHDF5 is true, a list of 3D-arrays (Nucleotide x Strand x Position) otherwise.

Details

This is a wrapper function for applying tallyBAM to a set of bam files specified in the bamfiles argument. If prepForHDF5 is not true the result is equivalent to calling tallyBAM with lapply on the file names, otherwise the resulting data structure has the same layout as the return value of h5readBlock and can be written to an HDF5 tally file directly. The order or samples along the sample dimension is the same as the order of the file names (i.e. the order of the bamfiles argument).

Examples

Run this code
library(h5vc)
library(BSgenome.Hsapiens.UCSC.hg19)
files <- c("NRAS.AML.bam","NRAS.Control.bam")
bamFiles <- file.path( system.file("extdata", package = "h5vcData"), files)
chrom = "1"
startpos <- 115247090
endpos <- 115259515
theData <- applyTallies( bamFiles, reference = Hsapiens[["chr1"]][startpos:endpos], chr = chrom, start = startpos, stop = endpos, ncycles = 10 )
str(theData)

Run the code above in your browser using DataLab