# estimateCommonDisp: Estimate Common Negative Binomial Dispersion by Conditional Maximum Likelihood

## Description

Maximizes the negative binomial conditional common likelihood to estimate a common dispersion value across all genes.

## Usage

S3 method for class `DGEList`:

`estimateCommonDisp(y, tol=1e-06, rowsum.filter=5, verbose=FALSE, ...)`

Default S3 method:

`estimateCommonDisp(y, group=NULL, lib.size=NULL, tol=1e-06, rowsum.filter=5, verbose=FALSE, ...)`

## Arguments

y

matrix of counts or a `DGEList` object.

tol

the desired accuracy, passed to `optimize`.

rowsum.filter

genes with total count (across all samples) below this value will be filtered out before estimating the dispersion.

verbose

logical, if `TRUE` then the estimated dispersion and BCV will be printed to standard output.

group

vector or factor giving the experimental group/condition for each library.

lib.size

numeric vector giving the total count (sequence depth) for each library.

...

other arguments that are not currently used.
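The effect of `rowsum.filter` can be illustrated in base R. This is a minimal sketch of the filtering idea only, not the package's internal code; the toy matrix and the variable names `keep` and `y_kept` are hypothetical, and treating "below" as strictly less than the threshold is an assumption.

```r
# Toy count matrix: 3 genes (rows) x 4 samples (columns)
y <- matrix(c(0, 1, 0, 2,
              3, 4, 5, 6,
              1, 1, 1, 1),
            nrow = 3, byrow = TRUE)

rowsum.filter <- 5

# Keep genes whose total count across all samples reaches the threshold
# (assumption: a gene with a total exactly equal to the threshold is kept)
keep <- rowSums(y) >= rowsum.filter
y_kept <- y[keep, , drop = FALSE]
```

Only the second gene, with a total count of 18, survives here; the other two have totals of 3 and 4.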

## Value

`estimateCommonDisp.DGEList` adds the following components to the input `DGEList` object:

- common.dispersion: estimate of the common dispersion.
- pseudo.counts: numeric matrix of pseudo-counts.
- pseudo.lib.size: the common library size to which the pseudo-counts have been adjusted.
- AveLogCPM: numeric vector giving log2(AveCPM) for each row of `y`.

`estimateCommonDisp.default` returns a numeric scalar of the common dispersion estimate.
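As a short illustration of the two return types, the sketch below calls both methods on simulated counts and reads back the components listed above. It assumes the edgeR package is installed; the seed and the simulated data are arbitrary choices made for the example.

```r
library(edgeR)

set.seed(42)
y <- matrix(rnbinom(250 * 4, mu = 20, size = 5), nrow = 250, ncol = 4)
group <- c(1, 1, 2, 2)

# DGEList method: returns the DGEList with the extra components added
dge <- estimateCommonDisp(DGEList(counts = y, group = group))
dge$common.dispersion        # scalar dispersion estimate
sqrt(dge$common.dispersion)  # BCV is the square root of the dispersion
dim(dge$pseudo.counts)       # matrix of pseudo-counts
dge$pseudo.lib.size          # common library size of the pseudo-counts
length(dge$AveLogCPM)        # one value per row of y

# Default method: takes a count matrix and returns only the scalar estimate
phi <- estimateCommonDisp(y, group = group)
```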

## Details

Implements the conditional maximum likelihood (CML) method proposed by Robinson and Smyth (2008) for estimating a common dispersion parameter.
This method proves to be accurate and nearly unbiased even for small counts and small numbers of replicates.

The CML method involves computing a matrix of quantile-quantile normalized counts, called pseudo-counts.
The pseudo-counts are adjusted in such a way that the library sizes are equal for all samples, while preserving differences between groups and variability within each group.
The pseudo-counts are included in the output of the function, but are intended mainly for internal edgeR use.
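To make the idea of conditional maximum likelihood concrete, here is a simplified base R sketch. It assumes all libraries have equal size, so no pseudo-counts are needed, and it conditions on the total count of each gene within each group. The function name `cond_loglik` and the search interval are hypothetical choices for illustration; this is not edgeR's implementation, which also handles unequal library sizes.

```r
# Conditional log-likelihood of a common dispersion phi, summed over genes
# and groups, assuming equal library sizes.
# y: matrix of counts (genes x samples); group: factor of sample groups.
cond_loglik <- function(phi, y, group) {
  r <- 1 / phi  # negative binomial size parameter
  total <- 0
  for (g in levels(group)) {
    yg <- y[, group == g, drop = FALSE]
    n <- ncol(yg)     # number of replicates in this group
    z <- rowSums(yg)  # per-gene group totals (the conditioning statistic)
    total <- total + sum(rowSums(lgamma(yg + r)) + lgamma(n * r) -
                           lgamma(z + n * r) - n * lgamma(r))
  }
  total
}

set.seed(1)
y <- matrix(rnbinom(250 * 4, mu = 20, size = 5), nrow = 250, ncol = 4)
group <- factor(c(1, 1, 2, 2))

# One-dimensional maximization over phi
fit <- optimize(cond_loglik, interval = c(1e-4, 10), y = y, group = group,
                maximum = TRUE, tol = 1e-6)
phi_hat <- fit$maximum
```

Because the data are simulated with `size=5`, `phi_hat` should land near the true dispersion of 0.2. Conditioning on the group totals removes the gene-specific means from the likelihood, which is what allows a single dispersion parameter to be estimated without first estimating a mean for every gene.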

## Examples

library(edgeR)
# True dispersion is 1/size = 1/5 = 0.2
y <- matrix(rnbinom(250*4,mu=20,size=5),nrow=250,ncol=4)
dge <- DGEList(counts=y,group=c(1,1,2,2))
dge <- estimateCommonDisp(dge, verbose=TRUE)
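The link between the `size` argument of `rnbinom` and the dispersion can be checked in base R. Under the mean and size parameterization the variance is mu + mu^2/size, so the dispersion is phi = 1/size. The following sketch compares the theoretical variance with a simulated one and backs out a simple moment estimate of phi; the variable names and the seed are illustrative choices only.

```r
set.seed(123)
mu <- 20
size <- 5
phi <- 1 / size  # dispersion implied by size

# Negative binomial variance: mu + phi * mu^2 = 20 + 0.2 * 400 = 100
theoretical_var <- mu + phi * mu^2

# Large simulated sample to compare against the theoretical value
x <- rnbinom(1e5, mu = mu, size = size)
empirical_var <- var(x)

# Method-of-moments estimate of the dispersion
phi_mom <- (empirical_var - mean(x)) / mean(x)^2
```

The moment estimate is only a sanity check on the parameterization; it is not the conditional maximum likelihood estimate that `estimateCommonDisp` computes.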