Subsample reads and perform statistical testing on each subsample
Perform subsampling at multiple proportions on a matrix of count data representing mapped reads across multiple samples in many genes. For each subsample, apply one or more statistical testing methods (such as edgeR or DESeq) and record a p-value and an effect size for each gene.
subsample(counts, proportions, method = "edgeR", replications = 1, seed = NULL, qvalues = TRUE, env = parent.frame(), ...)
- counts: Matrix of unnormalized counts
- proportions: Vector of subsampling proportions in (0, 1]
- method: One or more methods to be performed at each subsample, such as edgeR or DESeq (see Details)
- replications: Number of replications to perform at each depth
- seed: An initial seed, which will be stored in the output so that any individual simulation can be reproduced.
- qvalues: Whether q-values should be calculated for multiple hypothesis test correction at each subsample.
- env: Environment in which to find and evaluate additional handler functions that are given by name
- ...: Other arguments given to the handler, such as
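To make the roles of the proportions and the seed concrete, here is a minimal sketch of what subsampling a count matrix at a given proportion could look like. It assumes each read is kept independently with probability equal to the proportion (binomial thinning). That is an assumption for illustration, not a statement of how subsample() is implemented, and `subsampleCounts` is a hypothetical helper name.

```r
# Hypothetical helper (not part of the package): thin a count matrix by
# keeping each read independently with probability `proportion`.
subsampleCounts <- function(counts, proportion, seed = NULL) {
  stopifnot(proportion > 0, proportion <= 1)
  # Fixing the seed makes this particular subsample reproducible
  if (!is.null(seed)) set.seed(seed)
  sampled <- rbinom(length(counts), size = counts, prob = proportion)
  matrix(sampled, nrow = nrow(counts), dimnames = dimnames(counts))
}

# Toy matrix: 3 genes (rows) by 2 samples (columns)
toy <- matrix(c(0, 10, 200, 3, 50, 1000), nrow = 3,
              dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))

sub1 <- subsampleCounts(toy, 0.1, seed = 42)
sub2 <- subsampleCounts(toy, 0.1, seed = 42)  # same seed, same subsample
full <- subsampleCounts(toy, 1)               # proportion 1 keeps every read
```

Under this assumption a subsampled count can never exceed the original count, and a gene with no reads stays at zero at every depth.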
Method represents the name of a handler function, which can be custom-written by the user.
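As an illustration of a custom handler, the sketch below defines a function that takes a count matrix and a treatment vector and returns one row per gene with `coefficient` and `pvalue` columns, matching the columns listed in the return value. The argument names (`count.matrix`, `treatment`), the exact return shape expected by subsample(), and the name `tTestHandler` are assumptions made for this example. The test itself (a Welch t-test on log2 counts) is chosen only for simplicity.

```r
# Hypothetical handler: signature and return shape are assumptions,
# not a documented contract.
tTestHandler <- function(count.matrix, treatment, ...) {
  treatment <- factor(treatment)
  stopifnot(nlevels(treatment) == 2,
            length(treatment) == ncol(count.matrix))
  logged <- log2(count.matrix + 1)
  in.first <- treatment == levels(treatment)[1]
  per.gene <- apply(logged, 1, function(x) {
    # t.test() fails on constant data; report NA for the p-value in that case
    test <- tryCatch(t.test(x[!in.first], x[in.first]),
                     error = function(e) NULL)
    c(coefficient = mean(x[!in.first]) - mean(x[in.first]),
      pvalue = if (is.null(test)) NA_real_ else test$p.value)
  })
  # apply() returns a 2 x genes matrix; reshape to one row per gene
  data.frame(coefficient = per.gene["coefficient", ],
             pvalue = per.gene["pvalue", ])
}

# Toy data: 5 genes, 8 samples, two groups of 4
set.seed(1)
toy <- matrix(rpois(40, lambda = 20), nrow = 5,
              dimnames = list(paste0("gene", 1:5), NULL))
grp <- rep(c("A", "B"), each = 4)
result <- tTestHandler(toy, grp)
```

If a handler like this matches what subsample() expects, it would presumably be selected by passing its name as the method argument, with the env argument pointing at the environment where it is defined.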
If a gene has a count of 0 at a particular depth, we set the p-value to 1 and the coefficient to 0 to stay consistent between programs. If the gene has a count that is not 0 but the p-value is NA, we set the p-value to 1 but keep the estimated coefficient.
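The rule above can be sketched as a small post-processing step applied to a handler's output. The function name `applyZeroCountRule` and the layout of the inputs (a data frame with `pvalue` and `coefficient` columns plus a vector of per-gene counts) are assumptions for illustration. The package's internal code may differ.

```r
# Hypothetical helper illustrating the zero-count and NA p-value rule
applyZeroCountRule <- function(result, count) {
  zero <- count == 0
  # Genes with no reads at this depth: p-value 1, coefficient 0
  result$pvalue[zero] <- 1
  result$coefficient[zero] <- 0
  # Genes with reads but an NA p-value: p-value 1, coefficient kept as estimated
  missing.p <- !zero & is.na(result$pvalue)
  result$pvalue[missing.p] <- 1
  result
}

raw <- data.frame(pvalue = c(0.03, NA, NA, 0.5),
                  coefficient = c(1.2, NA, -0.7, 0.1))
count <- c(12, 0, 4, 0)
fixed <- applyZeroCountRule(raw, count)
```

In this toy input the second and fourth genes have no reads, so both end up with a p-value of 1 and a coefficient of 0, while the third gene keeps its coefficient of -0.7 and only has its missing p-value replaced.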
- A subsample S3 object, which is a data.table containing:
  - pvalue: A p-value calculated for each gene by the handler
  - coefficient: An effect size (usually log fold change) calculated for each gene by the handler
  - ID: gene ID
  - count: the number of reads to this specific gene in this subsample
  - depth: the overall sequencing depth of this subsample
  - method: the method used (the name of the handler)
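# assumes the subSeq package, which provides subsample() and the hammer dataset
library(subSeq)
data(hammer)

# use the first four samples and their design information
hammer.counts = Biobase::exprs(hammer)[, 1:4]
hammer.design = Biobase::pData(hammer)[1:4, ]

# drop genes with fewer than 5 reads in total
hammer.counts = hammer.counts[rowSums(hammer.counts) >= 5, ]

# subsample at 1%, 10% and 100% of reads, running three methods at each depth
ss = subsample(hammer.counts, c(.01, .1, 1), treatment=hammer.design$protocol,
               method=c("edgeR", "DESeq2", "voomLimma"))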