BatchJobs on high performance compute clusters (HPC)

BatchJobs for parallel computation on HPCs

Usage

batchTallyParam(
  bamFiles,
  destination,
  group,
  chrom, start, stop,
  blocksize = 100000,
  registryDir = tempdir(),
  resources = list("queue" = "research-rh6", "memory" = "4000", "ncpus" = "4", walltime = "90:00"),
  q = 25, ncycles = 0, max.depth = 1000000,
  reference = NULL,
  sleep = 5
)
batchTallies( confList = batchTallyParam() )
rerunBatchTallies( confList, tryCollect = TRUE )
collectTallies( blocks, confList, registries )
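The default `resources` list in the usage above is site-specific. Under standard R argument semantics, passing a shorter list replaces the whole default rather than merging with it. A minimal sketch of overriding a few entries while keeping the rest, using base R only; the queue name `"my-queue"` and the memory value are hypothetical placeholders, and which keys a given cluster accepts is an assumption that depends on the local BatchJobs configuration:

```r
# Defaults copied from the batchTallyParam() usage above.
default_resources <- list("queue" = "research-rh6", "memory" = "4000",
                          "ncpus" = "4", walltime = "90:00")

# Hypothetical site-specific overrides; "my-queue" is a placeholder name.
overrides <- list(queue = "my-queue", memory = "8000")

# modifyList() replaces the named entries and keeps all others.
my_resources <- modifyList(default_resources, overrides)
str(my_resources)

# By contrast, the bare override list lacks ncpus and walltime.
setdiff(names(default_resources), names(overrides))
```

The merged list could then be supplied as `resources = my_resources` when building the configuration with `batchTallyParam()`.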
Arguments

... setSampleData for details.

reference: ... if NULL, a consensus vote will be used to estimate the reference at any given position. This means you can no longer detect variants with AF >= 0.5, so you really should specify the reference, especially when tallying more than one bam file.

... prepareTallyFile for details.

group: ... "/ExampleStudy/22"

registryDir: ... BatchJobs will be held; this can be temporary, since we delete them when we are done.

... BatchJobs for details.

confList: ... batchTallyParam()

... batchTallies to verbose

tryCollect: ... rerunBatchTallies function should try to collect data from the specified registries before re-submitting.

blocks: ... data.frame defining blocks to tally in, result of a call to defineBlocks.

Details

... tallyBAM to a set of bam files specified in the bamFiles argument. The order of samples along the sample dimension is the same as the order of the file names (i.e. the order of the bamFiles argument). The function uses BatchJobs to dispatch tallying in blocks along the genome to an HPC, collects the results, and writes them into the HDF5 tally file specified in the destination parameter.

rerunBatchTallies can be used to re-submit failed blocks.

collectTallies can be used to manually collect tally data from the registries created by batchTallies.
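The note on `reference = NULL` can be illustrated with a toy calculation. This is only a sketch of the general idea of a per-position consensus vote, assuming the vote picks the most frequent base; it is not h5vc's actual implementation, and the counts and the helper `nonref_fraction` are made up for illustration:

```r
# Toy allele counts: three positions (rows), four bases (columns).
counts <- matrix(
  c(10, 0, 0, 0,   # position 1: all reads show A
     4, 0, 6, 0,   # position 2: true reference A, variant G at AF = 0.6
     7, 0, 3, 0),  # position 3: true reference A, variant G at AF = 0.3
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, c("A", "C", "G", "T"))
)
true_reference <- c("A", "A", "A")

# Assumed consensus vote: the most frequent base at each position.
consensus <- colnames(counts)[max.col(counts, ties.method = "first")]

# Hypothetical helper: fraction of reads that disagree with a given reference.
nonref_fraction <- function(counts, ref) {
  idx <- cbind(seq_len(nrow(counts)), match(ref, colnames(counts)))
  (rowSums(counts) - counts[idx]) / rowSums(counts)
}

af_true <- nonref_fraction(counts, true_reference)
af_consensus <- nonref_fraction(counts, consensus)

print(consensus)
print(af_true)
print(af_consensus)
```

At position 2 the majority base becomes the estimated reference, so the non-reference fraction measured against the consensus can never exceed 0.5. This is why supplying the real reference matters for high-frequency variants.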
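Examples

library(h5vc)
files <- c("NRAS.AML.bam", "NRAS.Control.bam")
bamFiles <- file.path(system.file("extdata", package = "h5vcData"), files)
chrom <- "1"
startpos <- 115247090
endpos <- 115259515
# Pass the region by name: matched by position, chrom, startpos and endpos
# would fill destination, group and chrom instead.
# destination and group have no defaults in the usage above and still need to be supplied.
batchTallies( batchTallyParam(bamFiles, chrom = chrom, start = startpos, stop = endpos) )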
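To make the idea of tallying "in blocks along the genome" concrete, the region from the example can be split into consecutive blocks in base R. This is a rough sketch only: `make_blocks` is a hypothetical helper, not h5vc's `defineBlocks`, and the inclusive coordinates and the column names `start` and `end` are assumptions made for illustration.

```r
# Hypothetical helper, not part of h5vc: split [start, stop] into
# consecutive blocks of at most `blocksize` positions (inclusive ends assumed).
make_blocks <- function(start, stop, blocksize) {
  starts <- seq(start, stop, by = blocksize)
  ends <- pmin(starts + blocksize - 1, stop)
  data.frame(start = starts, end = ends)
}

startpos <- 115247090
endpos <- 115259515

# A small block size, chosen here only so that several blocks appear.
blocks <- make_blocks(startpos, endpos, blocksize = 5000)
print(blocks)

# With a block size of 100000 the whole region fits into a single block.
print(nrow(make_blocks(startpos, endpos, blocksize = 100000)))
```

Each row would correspond to one unit of work that can be tallied independently, which is what makes the per-block dispatch to cluster jobs possible.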