subSeq (version 1.2.2)

generateSubsampledMatrix: Generate the read matrix corresponding to a particular level

Description

Generate a subsampled matrix from an original count matrix. This can be used to perform read subsampling analyses, though the subsample function is generally recommended for that purpose.

It is also useful for reproducing the results of an earlier run (see Details).

Usage

generateSubsampledMatrix(counts, proportion, seed, replication = 1)

Arguments

counts
Original matrix of read counts
proportion
The specific proportion to subsample
seed
A subsampling seed, which can be extracted from a subsamples or summary.subsamples object. If not given, the random seed is not set.
replication
Replicate number: allows performing multiple deterministic replications at a given subsampling proportion

Value

  • The subsampled count matrix at the specified subsampling proportion

Details

A subsamples object, or a summary.subsamples object, does not contain the subsampled count matrix at each depth (as it would take too much space and is rarely used). However, because it saves the random seed used to generate the count matrices, the count matrix at any depth can be regenerated. For a subsamples object ss, retrieve the seed with getSeed(ss). When that seed is passed to generateSubsampledMatrix along with the original counts, the proportion, and the replication number (if more than one subsampling was done at each proportion), the function produces the same matrix as was used in the analysis.

The seed is calculated deterministically using an md5 hash of three combined values: the global seed used for the subsampling object, the subsampling proportion, and the replication number for that proportion.
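
The idea of deriving one seed from three values can be illustrated with a minimal base-R sketch. It is not subSeq's actual implementation: derive_seed and thin_counts are hypothetical helpers, the way the three values are joined and the hash is turned into an integer are assumptions, and binomial thinning of each count is assumed as the subsampling step.

```r
# Hypothetical helper (not part of subSeq): turn three values into one integer seed.
derive_seed <- function(global_seed, proportion, replication) {
  # Assumption: the values are joined into a single string before hashing.
  key <- paste(global_seed, proportion, replication, sep = "_")
  tmp <- tempfile()
  on.exit(unlink(tmp))
  writeLines(key, tmp)
  hex <- unname(tools::md5sum(tmp))  # md5 digest as 32 hex characters
  # Keep 7 hex digits so the value fits in a signed 32-bit integer.
  strtoi(substr(hex, 1, 7), base = 16L)
}

# Hypothetical helper: subsample a count matrix reproducibly.
# Assumption: each count is thinned independently with binomial sampling.
thin_counts <- function(counts, proportion, global_seed, replication = 1) {
  set.seed(derive_seed(global_seed, proportion, replication))
  sub <- rbinom(length(counts), size = counts, prob = proportion)
  matrix(sub, nrow = nrow(counts), dimnames = dimnames(counts))
}

toy <- matrix(c(100, 50, 0, 20, 300, 7), nrow = 3)
a <- thin_counts(toy, 0.1, global_seed = 1234)
b <- thin_counts(toy, 0.1, global_seed = 1234)
identical(a, b)  # checks that the same inputs reproduce the same matrix
```

Because the seed depends only on the global seed, the proportion, and the replication number, any single matrix can be regenerated without rerunning the other proportions or replications. Note that this sketch calls set.seed and so changes the global random number state.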

Examples

data(hammer)

hammer.counts = Biobase::exprs(hammer)[, 1:4]
hammer.design = Biobase::pData(hammer)[1:4, ]
hammer.counts = hammer.counts[rowSums(hammer.counts) >= 5, ]

ss = subsample(hammer.counts, c(.01, .1, 1), treatment=hammer.design$protocol,
                 method=c("edgeR", "DESeq2", "voomLimma"))

seed = getSeed(ss)

# regenerate the matrices used at the 0.01 and 0.1 subsampling proportions
subm.01 = generateSubsampledMatrix(hammer.counts, .01, seed)
subm.1 = generateSubsampledMatrix(hammer.counts, .1, seed)