estimateCommonDisp: Estimate Common Negative Binomial Dispersion by Conditional Maximum Likelihood
Maximizes the negative binomial conditional common likelihood to estimate a common dispersion value across all genes.
## S3 method for class 'DGEList'
estimateCommonDisp(y, tol=1e-06, rowsum.filter=5, verbose=FALSE, ...)
## Default S3 method:
estimateCommonDisp(y, group=NULL, lib.size=NULL, tol=1e-06, rowsum.filter=5, verbose=FALSE, ...)
y: matrix of counts or a DGEList object.
tol: the desired accuracy, passed to
rowsum.filter: genes with total count (across all samples) below this value will be filtered out before estimating the dispersion.
verbose: logical; if TRUE, the estimated dispersion and BCV will be printed to standard output.
group: vector or factor giving the experimental group/condition for each library.
lib.size: numeric vector giving the total count (sequence depth) for each library.
...: other arguments that are not currently used.
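As a rough illustration of the rowsum.filter rule, the base R sketch below drops rows whose total count falls below the threshold. The matrix `toy` and the vectors `totals` and `keep` are made up for this example and are not part of edgeR. How edgeR treats a row whose total equals the threshold exactly is not covered here, so the toy data avoid that case.

```r
# Toy count matrix: 4 genes (rows) x 4 libraries (columns); values are made up.
toy <- rbind(g1 = c(0, 1, 0, 2),
             g2 = c(10, 12, 9, 11),
             g3 = c(1, 1, 1, 1),
             g4 = c(3, 0, 2, 3))

rowsum.filter <- 5          # same default as in the usage above
totals <- rowSums(toy)      # total count per gene across all samples

# Keep genes whose total is not below the threshold (illustrative rule only).
keep <- totals >= rowsum.filter
filtered <- toy[keep, , drop = FALSE]
```

Only the rows that pass the filter would then contribute to the dispersion estimate.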
estimateCommonDisp.DGEList adds the following components to the input DGEList object:
- estimate of the common dispersion.
- numeric matrix of pseudo-counts.
- the common library size to which the pseudo-counts have been adjusted.
- numeric vector giving log2(AveCPM) for each row of y.
estimateCommonDisp.default returns a numeric scalar of the common dispersion estimate.
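The two return styles can be compared side by side. The sketch below assumes edgeR is installed and that the dispersion component of the returned DGEList is named `common.dispersion`, which is my understanding of edgeR's naming rather than something stated in this page. The BCV is taken as the square root of the dispersion.

```r
library(edgeR)

set.seed(42)
counts <- matrix(rnbinom(250 * 4, mu = 20, size = 5), nrow = 250, ncol = 4)
group <- c(1, 1, 2, 2)

# Default method: takes a count matrix and returns a numeric scalar.
phi <- estimateCommonDisp(counts, group = group)

# DGEList method: returns the DGEList with extra components added.
dge <- estimateCommonDisp(DGEList(counts = counts, group = group))

# Assumed component name; BCV is the square root of the dispersion.
bcv <- sqrt(dge$common.dispersion)
```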
Implements the conditional maximum likelihood (CML) method proposed by Robinson and Smyth (2008) for estimating a common dispersion parameter.
This method proves to be accurate and nearly unbiased even for small counts and small numbers of replicates.
The CML method involves computing a matrix of quantile-quantile normalized counts, called pseudo-counts.
The pseudo-counts are adjusted in such a way that the library sizes are equal for all samples, while preserving differences between groups and variability within each group.
The pseudo-counts are included in the output of the function, but are intended mainly for internal edgeR use.
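The conditioning idea can be sketched in base R. Assuming equal library sizes, the sum of a gene's counts within a group is sufficient for its mean, so conditioning on that sum leaves a likelihood that depends only on the dispersion. The sketch below sums this conditional log-likelihood over genes and groups and maximizes it with `optimize`. It is a simplified illustration, not edgeR's implementation: it skips the pseudo-count adjustment and the rowsum.filter step, and the helper names `cond_loglik` and `common_loglik` and the search interval are assumptions made for this example.

```r
# Conditional log-likelihood for one group, summed over genes, up to terms
# that do not depend on the dispersion.
# y: genes x replicates count matrix for a single group; phi: dispersion.
cond_loglik <- function(y, phi) {
  r <- 1 / phi        # negative binomial size parameter
  n <- ncol(y)        # number of replicates in this group
  z <- rowSums(y)     # per-gene total, the conditioning statistic
  sum(rowSums(lgamma(y + r)) + lgamma(n * r) - lgamma(z + n * r) - n * lgamma(r))
}

set.seed(1)
# Simulated data with true dispersion 1/5 = 0.2 and equal expected library sizes.
y <- matrix(rnbinom(250 * 4, mu = 20, size = 5), nrow = 250, ncol = 4)
group <- c(1, 1, 2, 2)

# Common log-likelihood: add the conditional log-likelihoods of all groups.
common_loglik <- function(phi) {
  cols_by_group <- split(seq_len(ncol(y)), group)
  sum(sapply(cols_by_group,
             function(cols) cond_loglik(y[, cols, drop = FALSE], phi)))
}

# One-dimensional maximization over an assumed search interval.
fit <- optimize(common_loglik, interval = c(1e-4, 10), maximum = TRUE, tol = 1e-6)
phi_hat <- fit$maximum
```

With data simulated this way, `phi_hat` should land near the true value of 0.2, though the exact figure depends on the random seed.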
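library(edgeR)
# Simulate 250 genes x 4 libraries; size=5 gives a true dispersion of 1/5 = 0.2
y <- matrix(rnbinom(250*4, mu=20, size=5), nrow=250, ncol=4)
# Two groups with two replicate libraries each
dge <- DGEList(counts=y, group=c(1,1,2,2))
# verbose=TRUE prints the estimated common dispersion and BCV
dge <- estimateCommonDisp(dge, verbose=TRUE)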