subsample: Subsample reads and perform statistical testing on each sample

Description

Subsample a matrix of count data, representing reads mapped to many genes across multiple samples, at multiple proportions. For each subsample, apply one or more statistical testing methods to every gene.
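As a rough illustration of subsampling a count matrix at a given proportion, the sketch below uses binomial thinning: each read is kept independently with probability equal to the proportion. This is one common way to subsample reads. Whether subsample uses exactly this scheme is an assumption, and the helper name thinCounts and the toy matrix are hypothetical.

```r
# Hypothetical helper, not part of the package: binomial thinning of a count matrix.
# ASSUMPTION: each read is retained independently with probability `proportion`.
thinCounts <- function(counts, proportion) {
  stopifnot(proportion > 0, proportion <= 1)
  # rbinom is vectorised over `size`, so every cell is thinned separately;
  # matrix() refills in the same column-major order that `counts` was read in
  matrix(rbinom(length(counts), size = counts, prob = proportion),
         nrow = nrow(counts), dimnames = dimnames(counts))
}

set.seed(1)  # fixed seed so the sketch is reproducible
toy <- matrix(c(100, 0, 2500, 40,
                80, 3, 2700, 55),
              nrow = 4,
              dimnames = list(paste0("gene", 1:4), c("s1", "s2")))

sub <- thinCounts(toy, 0.1)   # roughly 10% of the reads in each cell
full <- thinCounts(toy, 1)    # proportion 1 keeps every read
```

A thinned count can never exceed the original count, and a gene with zero reads stays at zero at every proportion.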

Usage

subsample(counts, proportions, method = "edgeR", replications = 1,
  seed = NULL, qvalues = TRUE, env = parent.frame(), ...)

Arguments

counts

Matrix of unnormalized counts

proportions

Vector of subsampling proportions in (0, 1]

method

One or more methods to be performed at each subsample, such as edgeR or DESeq (see Details)

replications

Number of replications to perform at each depth

seed

An initial seed, which will be stored in the output so that any individual simulation can be reproduced.

qvalues

Whether q-values should be calculated for multiple hypothesis test correction at each subsample.

env

Environment in which to find and evaluate additional handler functions that are given by name

...

Other arguments given to the handler, such as treatment

Value

A subsample S3 object, which is a data.table containing

pvalue

A p-value calculated for each gene by the handler

coefficient

An effect size (usually log fold change) calculated for each gene by the handler

ID

Gene ID

count

The number of reads to this specific gene in this subsample

depth

The overall sequencing depth of this subsample

method

The method used (the name of the handler)

Details

The method argument gives the name of a handler function, which can be custom-written by the user.
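A custom handler might look like the sketch below. It assumes that a handler receives the subsampled count matrix together with the extra arguments passed through ... (here treatment), and returns one coefficient and one p-value per gene, matching the columns listed under Value. The argument names count.matrix and treatment, the return shape, and the function name ttestHandler are assumptions for illustration, not the package's confirmed interface.

```r
# Hypothetical handler. ASSUMPTIONS: it is called with a count matrix and a
# two-level treatment vector, and returns a data.frame with columns
# `coefficient` and `pvalue`, one row per gene.
ttestHandler <- function(count.matrix, treatment) {
  treatment <- factor(treatment)
  stopifnot(nlevels(treatment) == 2)
  logged <- log2(count.matrix + 1)            # log scale, +1 avoids log2(0)
  in.first <- treatment == levels(treatment)[1]
  # effect size: difference in mean log2 count (second level minus first level)
  coefficient <- rowMeans(logged[, !in.first, drop = FALSE]) -
    rowMeans(logged[, in.first, drop = FALSE])
  # per-gene Welch t-test; NA when the test cannot be computed
  pvalue <- apply(logged, 1, function(r) {
    tryCatch(t.test(r[!in.first], r[in.first])$p.value,
             error = function(e) NA_real_)
  })
  data.frame(coefficient = coefficient, pvalue = pvalue)
}

# Toy data: two genes, two control and two treated samples
toy <- matrix(c(10, 12, 40, 44,
                5, 6, 5, 7),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("geneA", "geneB"), paste0("s", 1:4)))
out <- ttestHandler(toy, treatment = c("ctrl", "ctrl", "trt", "trt"))
```

Under the same assumptions, such a function could then be supplied by name, for example method = "ttestHandler", with env pointing at the environment where it is defined.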

If a gene has a count of 0 at a particular depth, we set the p-value to 1 and the coefficient to 0 to stay consistent between programs. If the gene has a count that is not 0 but the p-value is NA, we set the p-value to 1 but keep the estimated coefficient.
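The rule above can be sketched in base R as follows. The helper name cleanResults and the toy vectors are hypothetical. This illustrates the stated behaviour and is not the package's own code.

```r
# Hypothetical helper illustrating the zero-count / NA p-value rule
cleanResults <- function(count, pvalue, coefficient) {
  zero <- count == 0
  # zero-count genes: p-value 1 and coefficient 0
  pvalue[zero] <- 1
  coefficient[zero] <- 0
  # non-zero genes with a missing p-value: p-value 1, coefficient kept as estimated
  pvalue[!zero & is.na(pvalue)] <- 1
  data.frame(count = count, pvalue = pvalue, coefficient = coefficient)
}

res <- cleanResults(count       = c(0, 12, 7),
                    pvalue      = c(NA, NA, 0.03),
                    coefficient = c(NA, 1.5, -0.8))
# gene 1: count 0             -> pvalue 1, coefficient 0
# gene 2: count 12, NA pvalue -> pvalue 1, coefficient stays 1.5
# gene 3: unchanged
```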

Examples

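# Load the example dataset
data(hammer)

# Keep the first four samples and their design information
hammer.counts = Biobase::exprs(hammer)[, 1:4]
hammer.design = Biobase::pData(hammer)[1:4, ]
# Drop genes with fewer than 5 reads in total across the four samples
hammer.counts = hammer.counts[rowSums(hammer.counts) >= 5, ]

# Subsample at 1%, 10% and 100% of reads and run three methods on each subsample
ss = subsample(hammer.counts, c(.01, .1, 1), treatment=hammer.design$protocol,
                 method=c("edgeR", "DESeq2", "voomLimma"))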
