Usage
tallyRanges(bamfiles, ranges, reference, q = 25, ncycles = 10, max.depth = 1e+06)
tallyRangesToFile(tallyFile, study, bamfiles, ranges, reference, samples = NULL, q = 25, ncycles = 0, max.depth = 1e+06)
tallyRangesBatch(tallyFile, study, bamfiles, ranges, reference, q = 25, ncycles = 10, max.depth = 1e+06, regID = "Tally", res = list("ncpus" = 2, "memory" = 24000, "queue" = "research-rh6"), written = c(), wrfile = "written.jobs.RDa", waitTime = Inf)
Arguments
bamfiles
Character vector giving the locations of the BAM files to be tallied.
ranges
A GRanges object describing the ranges in which tallies shall be generated, e.g. the result of a call to binGenome or a set of exon or gene annotations provided by a TxDB object.
reference
BSgenome object describing the reference genome that the alignments were made against.
samples
The indices (within the HDF5 datasets) of the samples that the data represent. Use this option to write subsets of samples from a cohort.
q
Read alignment quality cut-off.
ncycles
Number of cycles at the front and back of each read that should be considered unreliable for mismatch detection.
max.depth
Maximum depth of coverage to consider.
tallyFile
Filename of the HDF5 tally file that the data shall be written to.
study
The location within the HDF5 file that corresponds to the HDF5-group representing the study we are working on.
regID
Identifier for a BatchJobs registry which will be used to store and organise the cluster jobs used for parallelisation of the work.
res
Resource list specifying the compute resources to be requested for each of the cluster jobs.
written
Numeric vector of the IDs of jobs whose results have already been written to the tally file. This can be used to resume writing after a crash.
wrfile
Name of a file in which to store the IDs of already written jobs. It can be used to resume writing after a crash.
waitTime
How long the function should wait for cluster jobs to finish before giving up. The default, Inf, is to wait forever.
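
The res, written and wrfile arguments of tallyRangesBatch can be prepared ahead of the call. The following is a minimal sketch in base R, not taken from the package itself. It assumes that the file named by wrfile was created with save() and holds a single vector of job IDs; the internal layout of that file is not documented here, so check it before relying on this. The helper readWrittenJobs and the queue name "short" are hypothetical.

```r
# Hypothetical helper (not part of h5vc): read the IDs of already written
# jobs from an .RDa file, or return an empty vector if the file is missing.
# Assumption: the file was created with save() and its first object is the
# vector of job IDs.
readWrittenJobs <- function(wrfile) {
  if (!file.exists(wrfile)) {
    return(c())
  }
  env <- new.env()
  objNames <- load(wrfile, envir = env)
  get(objNames[1], envir = env)
}

# Start from the documented default resources and override only the entries
# that differ on the local cluster. "short" is a placeholder queue name.
defaultRes <- list("ncpus" = 2, "memory" = 24000, "queue" = "research-rh6")
res <- modifyList(defaultRes, list("ncpus" = 4, "queue" = "short"))

# Pick up the IDs recorded by an earlier, interrupted run (if any).
wrfile <- "written.jobs.RDa"
written <- readWrittenJobs(wrfile)

# The call itself needs h5vc, BAM files and a configured cluster, so it is
# only shown as a comment:
# tallyRangesBatch(tallyFile, study, bamfiles, ranges, reference,
#                  res = res, written = written, wrfile = wrfile)
```

Building res with modifyList keeps the remaining default entries intact, so only the values that are specific to the local cluster need to be spelled out.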