Tallying BAM files in parallel with BatchJobs on high performance compute clusters (HPC)

Tallies BAM files using BatchJobs for parallel computation on HPCs.

batchTallyParam(
bamFiles,
destination,
group,
chrom, start, stop,
blocksize = 100000,
registryDir = tempdir(),
resources = list("queue" = "research-rh6", "memory"="4000", "ncpus"="4", walltime="90:00"),
q=25, ncycles = 0, max.depth=1000000,
reference = NULL,
sleep = 5
)
batchTallies( confList = batchTallyParam() )
rerunBatchTallies( confList, tryCollect = TRUE )
collectTallies(blocks, confList, registries )
see setSampleData for details.

reference: if NULL, a consensus vote will be used to estimate the reference at any given position. This means you can no longer detect variants with AF >= 0.5. Especially when tallying more than one BAM file, you really should specify this.

see prepareTallyFile for details

"/ExampleStudy/22"

registryDir: directory in which the BatchJobs registries will be held; this can be temporary, since the registries are deleted when the run is done.

resources: see BatchJobs for details

confList: a configuration list as returned by batchTallyParam()

batchTallies to verbose

tryCollect: whether the rerunBatchTallies function should try to collect data from the specified registries before re-submitting.

blocks: data.frame defining blocks to tally in, result of a call to defineBlocks
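To make the idea of tallying "in blocks" concrete, here is a minimal base R sketch of how a genomic range might be cut into consecutive blocks of at most `blocksize` positions. The helper `make_blocks` and its `Start`/`End` column names are hypothetical and used only for illustration; this is not the implementation of defineBlocks, and its actual output format may differ.

```r
# Hypothetical helper, not part of h5vc: split [start, stop] into
# consecutive blocks of at most `blocksize` positions each.
make_blocks <- function(start, stop, blocksize) {
  starts <- seq(start, stop, by = blocksize)
  # The last block is truncated so that it does not run past `stop`
  ends <- pmin(starts + blocksize - 1, stop)
  data.frame(Start = starts, End = ends)
}

# The region from the example on this page, cut with a small block size
blocks <- make_blocks(115247090, 115259515, blocksize = 5000)
print(blocks)
```

Each row of such a table would correspond to one unit of work that can be dispatched independently.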
This function applies tallyBAM to a set of BAM files specified in the bamFiles argument. The order of samples along the sample dimension is the same as the order of the file names (i.e. the order of the bamFiles argument). The function uses BatchJobs to dispatch tallying in blocks along the genome to an HPC, collects the results, and writes them into the HDF5 tally file specified in the destination parameter.

rerunBatchTallies can be used to re-submit failed blocks.

collectTallies can be used to manually collect tally data from the registries created by batchTallies.
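The dispatch, re-submit and collect pattern described above can be sketched in plain R without a cluster. Everything in this sketch is an assumption made for illustration: `tally_block` is a hypothetical stand-in for the per-block work, the failure of block 2 is simulated, and `lapply` replaces the BatchJobs submission. It does not reflect how batchTallies, rerunBatchTallies, or collectTallies are implemented.

```r
# Toy block table; the Start/End column names are assumed for illustration
blocks <- data.frame(Start = c(1, 101, 201), End = c(100, 200, 250))

# Hypothetical per-block job: returns the block width as a placeholder result
# and fails once for block 2 to mimic a job that died on the cluster.
failed_once <- FALSE
tally_block <- function(i) {
  if (i == 2 && !failed_once) {
    failed_once <<- TRUE
    stop("simulated job failure")
  }
  blocks$End[i] - blocks$Start[i] + 1
}

# "Submit" a set of blocks; a failed block yields NULL instead of aborting the run
run_blocks <- function(ids) {
  lapply(ids, function(i) tryCatch(tally_block(i), error = function(e) NULL))
}

# First pass over all blocks
res <- run_blocks(seq_len(nrow(blocks)))

# Identify failed blocks and re-submit only those
failed <- which(vapply(res, is.null, logical(1)))
res[failed] <- run_blocks(failed)

# Collect the per-block results in block order
widths <- unlist(res)
print(widths)
```

Keeping results indexed by block means a re-run only has to fill in the missing entries, which is the reason for re-submitting failed blocks rather than the whole job.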
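library(h5vc)
# Example BAM files shipped with the h5vcData package
files <- c("NRAS.AML.bam", "NRAS.Control.bam")
bamFiles <- file.path(system.file("extdata", package = "h5vcData"), files)
chrom <- "1"
startpos <- 115247090
endpos <- 115259515
# destination and group are placeholder values; arguments are passed by name so that
# chrom, start and stop are not matched positionally to destination and group
destination <- file.path(tempdir(), "example.tally.hfs5")
group <- "/ExampleStudy/1"
batchTallies(batchTallyParam(bamFiles = bamFiles, destination = destination, group = group,
                             chrom = chrom, start = startpos, stop = endpos))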