ssizeRNA_single: Sample Size Calculations for Two-Sample RNA-seq Experiments with Single Set of Parameters

Description

This function calculates the sample size needed to reach a desired power in two-sample RNA-seq experiments where the mean and dispersion parameters are identical for all genes. Calculations are performed for a controlled false discovery rate and user-specified values of the proportion of non-differentially expressed genes, the mean counts in the control group, the dispersion, and the log fold change. A plot of power versus sample size is generated.

Usage

ssizeRNA_single(nGenes = 10000, pi0 = 0.8, m = 200, mu, disp, logfc, up = 0.5, replace = TRUE, fdr = 0.05, power = 0.8, maxN = 35, side = "two-sided", cex.title = 1.15, cex.legend = 1)

Arguments

nGenes

total number of genes, the default value is 10000.

pi0

proportion of non-differentially expressed genes, the default value is 0.8.

m

pseudo sample size for generated data, the default value is 200.

mu

a vector (or scalar) of mean counts in control group from which to simulate.

disp

a vector (or scalar) of dispersion parameter from which to simulate.

logfc

a vector (or scalar, or a function that takes an integer n and generates a vector of length n) of log fold change for differentially expressed (DE) genes.

up

proportion of up-regulated genes among all DE genes, the default value is 0.5.

replace

whether to sample with replacement (TRUE) or without replacement (FALSE) from the given parameters. See Details for more information.

fdr

the false discovery rate to be controlled.

power

the desired power to be achieved.

maxN

the maximum sample size used for power calculations.

side

sidedness of the test; options are "two-sided", "upper", or "lower".

cex.title

controls size of chart titles.

cex.legend

controls size of chart legend.

Value

ssize: sample sizes (for each treatment) at which desired power is first reached.
power: power calculations with corresponding sample sizes.
crit.vals: critical value calculations with corresponding sample sizes.

Details

If a vector is input for pi0, sample size calculations are performed for each proportion.

If the total number of genes (nGenes) is larger than the length of mu or disp, sampling is always done with replacement, that is, replace is treated as TRUE regardless of the value supplied.
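
The vector form of pi0 described above can be sketched as follows. This is a minimal illustration, assuming the ssizeRNA package is installed; the name pi0_grid and the three proportions are illustrative choices, and the remaining argument values are taken from the Examples section.

```r
## Sketch only: the proportions below are illustrative, not recommendations.
pi0_grid <- c(0.6, 0.8, 0.9)   ## proportions of non-DE genes to compare

## Guarded so the snippet is harmless when the package is not installed.
if (requireNamespace("ssizeRNA", quietly = TRUE)) {
  size_grid <- ssizeRNA::ssizeRNA_single(pi0 = pi0_grid, m = 30, mu = 10,
                                         disp = 0.1, logfc = log(2),
                                         maxN = 20)
  size_grid$ssize              ## sample sizes reported for the supplied pi0 values
}
```

Per the description above, one sample size calculation is performed for each entry of pi0_grid.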

References

Liu, P. and Hwang, J. T. G. (2007) Quick calculation for sample size while controlling false discovery rate with application to microarray analysis. Bioinformatics 23(6): 739-746.

Orr, M. and Liu, P. (2009) Sample size estimation while controlling false discovery rate for microarray experiments using ssize.fdr package. The R Journal 1(1): 47-53.

Law, C. W., Chen, Y., Shi, W. and Smyth, G. K. (2014) Voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biology 15: R29.

Examples

mu <- 10                ## mean counts in control group for all genes
disp <- 0.1             ## dispersion for all genes
logfc <- log(2)         ## log fold change for DE genes

size <- ssizeRNA_single(m = 30, mu = mu, disp = disp, logfc = logfc, 
                        maxN = 20)
size$ssize              ## first sample size to reach desired power
size$power              ## calculated power for each sample size
size$crit.vals          ## calculated critical value for each sample size
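
The logfc argument is documented as also accepting a function that takes an integer n and returns a vector of length n. A minimal base R sketch of what such a generator could look like is given below; the name logfc_gen and the choice of a normal distribution centred at log(2) are assumptions made for illustration, not part of the package. It only demonstrates the function contract and does not call ssizeRNA_single.

```r
## Hypothetical generator: draws n log fold changes centred at log(2).
logfc_gen <- function(n) {
  rnorm(n, mean = log(2), sd = 0.5 * log(2))
}

set.seed(1)                    ## make the draw reproducible
lfc <- logfc_gen(5)            ## request five log fold changes
length(lfc)                    ## the result has one value per requested gene
```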
