Generate the read matrix corresponding to a particular level
Generate a subsampled matrix from an original count matrix. This can be used
to perform read subsampling analyses (though the subsample function is
generally recommended for that purpose).
It is also useful for reproducing the results of an earlier run (see Details).
generateSubsampledMatrix(counts, proportion, seed, replication = 1)
counts - Original matrix of read counts
proportion - The specific proportion to subsample
seed - A subsampling seed, which can be extracted from a subsamples or summary.subsamples object. If omitted, the random seed is not set.
replication - Replicate number: allows performing multiple deterministic replications at a given subsampling proportion
A subsamples object, or a summary.subsamples object, does not contain the
subsampled count matrix at each depth (as it would take too much space and
is rarely used). However, as it saves the random seed used to generate the
count matrix, the count matrix at any depth can be retrieved. For a
subsamples object ss, this is done by retrieving the seed with getSeed(ss).
When the seed is given along with the original counts, the proportion, and
the replication number (if more than one subsampling was done at each
proportion), this function produces the same matrix as was used in the
analysis.
The seed is calculated deterministically using an md5 hash of three combined values: the global seed used for the subsampling object, the subsampling proportion, and the replication number for that proportion.
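The seed derivation described above can be illustrated with a minimal sketch. This is not the package's own code: deriveSeed and toySubsample are hypothetical helpers, the digest package is assumed to be installed, the way the hash is turned into an integer is one possible choice, and binomial thinning is only an assumed stand-in for the actual subsampling step.

```r
# Hypothetical sketch, not part of the package.
library(digest)  # assumed installed; provides digest()

# Derive an integer seed from an md5 hash of the three combined values.
deriveSeed = function(seed, proportion, replication = 1) {
    hash = digest(c(seed, proportion, replication), algo = "md5")
    # keep 7 hex digits (28 bits) so the value fits in an R integer
    strtoi(substr(hash, 1, 7), base = 16L)
}

# Assumed stand-in for subsampling: each read is kept independently
# with probability `proportion` (binomial thinning).
toySubsample = function(counts, proportion, seed, replication = 1) {
    set.seed(deriveSeed(seed, proportion, replication))
    matrix(rbinom(length(counts), counts, proportion),
           nrow = nrow(counts), dimnames = dimnames(counts))
}

# Small made-up count matrix: 3 genes by 2 samples
counts = matrix(c(120, 30, 0, 450, 80, 15), nrow = 3,
                dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))

a = toySubsample(counts, 0.1, seed = 42)
b = toySubsample(counts, 0.1, seed = 42)
r2 = toySubsample(counts, 0.1, seed = 42, replication = 2)

identical(a, b)  # TRUE: the same three values give the same derived seed
```

Because the derived seed depends on the replication number as well as the global seed and the proportion, r2 is drawn from a different random stream than a, while repeated calls with identical arguments reproduce the same matrix.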
- The subsampled matrix at the specified subsampling proportion
data(hammer)
hammer.counts = Biobase::exprs(hammer)[, 1:4]
hammer.design = Biobase::pData(hammer)[1:4, ]
hammer.counts = hammer.counts[rowSums(hammer.counts) >= 5, ]

ss = subsample(hammer.counts, c(.01, .1, 1), treatment=hammer.design$protocol,
               method=c("edgeR", "DESeq2", "voomLimma"))

seed = getSeed(ss)

# generate the matrices used at each subsample
subm.01 = generateSubsampledMatrix(hammer.counts, .01, seed)
subm.1 = generateSubsampledMatrix(hammer.counts, .1, seed)