estimate.dispersion: Estimate Negative Binomial Dispersion

Description

Estimate NB dispersion by modeling it as a parametric function of preliminarily estimated log mean relative frequencies.

Usage

estimate.dispersion(nb.data, x, model = "NBQ", method = "MAPL", ...)

Arguments

nb.data

output from prepare.nb.data.

x

a design matrix specifying the mean structure of each row.

model

the name of the dispersion model, one of "NB2", "NBP", "NBQ" (default), "NBS" or "step".

method

a character string specifying the method for estimating the dispersion model, one of "ML" or "MAPL" (default).

...

(for future use).

Value

a list with the following components:

estimates

dispersion estimates for each read count, a matrix of the same dimensions as the counts matrix in nb.data.

likelihood

the likelihood of the fitted model.

model

details of the estimated dispersion model, NOT intended for use by end users. The name and contents of this component are subject to change in future versions.

Details

We use a negative binomial (NB) distribution to model the read frequency of gene \(i\) in sample \(j\). The NB distribution uses a dispersion parameter \(\phi_{ij}\) to model the extra-Poisson variation between biological replicates. Under the NB model, the mean \(\mu_{ij}\) and variance \(\sigma_{ij}^2\) of a single read count satisfy \(\sigma_{ij}^2 = \mu_{ij} + \phi_{ij} \mu_{ij}^2\). Because RNA-Seq experiments typically have small sample sizes, estimating the NB dispersion \(\phi_{ij}\) for each gene \(i\) separately is not reliable. One can instead pool information across genes and biological samples by modeling \(\phi_{ij}\) as a function of the mean frequencies and library sizes.
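The mean-variance relationship above can be checked numerically. The following is a minimal base R sketch, not part of the package: the helper name nb_variance and the values of mu and phi are illustrative assumptions. It relies on the fact that the size argument of rnbinom corresponds to 1/phi when the mean is given through mu.

```r
# Hypothetical helper (not part of the package): variance implied by
# sigma^2 = mu + phi * mu^2.
nb_variance <- function(mu, phi) mu + phi * mu^2

set.seed(1)
mu  <- 100   # illustrative mean read count
phi <- 0.1   # illustrative dispersion

# In rnbinom's mean parameterization, size = 1 / phi.
y <- rnbinom(1e5, mu = mu, size = 1 / phi)

# Compare the theoretical variance with the sample variance.
c(theoretical = nb_variance(mu, phi), empirical = var(y))
```

With phi = 0 the formula reduces to the Poisson variance, which is why phi is described as capturing extra-Poisson variation.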

Under the NB2 model, the dispersion is a constant across all genes and samples.

Under the NBP model, the log dispersion is modeled as a linear function of the preliminary estimates of the log mean relative frequencies (pi.pre):

log(phi) = par[1] + par[2] * log(pi.pre/pi.offset),

where pi.offset is 1e-4.
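As a rough illustration, the NBP formula can be evaluated directly in base R. The function name nbp_dispersion and the parameter values below are assumptions made for this sketch; they are not the package's internal code or fitted estimates.

```r
# Hypothetical helper: evaluate the NBP dispersion formula
# log(phi) = par[1] + par[2] * log(pi.pre / pi.offset).
nbp_dispersion <- function(pi.pre, par, pi.offset = 1e-4) {
  exp(par[1] + par[2] * log(pi.pre / pi.offset))
}

pi.pre <- c(1e-6, 1e-5, 1e-4, 1e-3)  # illustrative relative frequencies
par    <- c(log(0.1), -0.5)          # illustrative parameters only

# phi equals exp(par[1]) where pi.pre == pi.offset.
nbp_dispersion(pi.pre, par)
```

With par[2] = 0 the dispersion no longer depends on pi.pre, so the NBP form reduces to the constant-dispersion NB2 model.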

Under the NBQ model, the log dispersion is modeled as a quadratic function of the preliminary estimates of the log mean relative frequencies (pi.pre):

log(phi) = par[1] + par[2] * z + par[3] * z^2,

where z = log(pi.pre/pi.offset). By default, pi.offset is the median of pi.pre[subset,].
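The NBQ formula can be sketched the same way. The function name nbq_dispersion and the parameter values are assumptions for illustration; the sketch takes pi.pre as a plain vector and defaults pi.offset to its median, which only approximates the package default described above.

```r
# Hypothetical helper: evaluate the NBQ dispersion formula
# log(phi) = par[1] + par[2] * z + par[3] * z^2, z = log(pi.pre / pi.offset).
nbq_dispersion <- function(pi.pre, par, pi.offset = median(pi.pre)) {
  z <- log(pi.pre / pi.offset)
  exp(par[1] + par[2] * z + par[3] * z^2)
}

pi.pre <- c(1e-6, 1e-5, 1e-4, 1e-3, 1e-2)  # illustrative relative frequencies
par    <- c(log(0.1), -0.3, 0.05)          # illustrative parameters only

nbq_dispersion(pi.pre, par)
```

Setting par[3] = 0 removes the quadratic term and gives the same log-linear form as the NBP model.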

Under the NBS model, the dispersion is modeled as a smooth function (a natural cubic spline function) of the preliminary estimates of the log mean relative frequencies (pi.pre).
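One way to picture a natural cubic spline dispersion curve is with the ns basis from the splines package that ships with R. This is only a sketch under assumptions: the function name nbs_dispersion, the choice to model log(phi) on the log(pi.pre) scale, the number of basis functions, and the coefficients are all illustrative and may differ from what the package does internally.

```r
library(splines)

# Hypothetical helper: log(phi) as an intercept plus a natural cubic
# spline in log(pi.pre). The basis dimension is set by length(par) - 1.
nbs_dispersion <- function(pi.pre, par) {
  z <- log(pi.pre)
  basis <- ns(z, df = length(par) - 1)
  exp(drop(cbind(1, basis) %*% par))
}

pi.pre <- exp(seq(log(1e-6), log(1e-2), length.out = 50))  # illustrative grid
par    <- c(log(0.1), -1, -0.5, 0.2)                        # illustrative only

phi <- nbs_dispersion(pi.pre, par)
range(phi)
```

Modeling on the log scale keeps every dispersion value positive regardless of the spline coefficients.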

Under the "step" model, the dispersion is modeled as a step (piecewise constant) function.
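A piecewise constant dispersion function can be sketched with findInterval. The function name step_dispersion, the breakpoints, and the dispersion levels are hypothetical; this page does not say how the package chooses its steps.

```r
# Hypothetical helper: piecewise constant dispersion. phi.levels must
# have one more element than breaks (one level per interval).
step_dispersion <- function(pi.pre, breaks, phi.levels) {
  stopifnot(length(phi.levels) == length(breaks) + 1)
  # findInterval returns 0 for values below the first break.
  phi.levels[findInterval(pi.pre, breaks) + 1]
}

pi.pre     <- c(1e-6, 1e-4, 1e-2)  # illustrative relative frequencies
breaks     <- c(1e-5, 1e-3)        # illustrative breakpoints
phi.levels <- c(0.5, 0.2, 0.05)    # illustrative dispersion per interval

step_dispersion(pi.pre, breaks, phi.levels)
```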

Examples

# NOT RUN {
## See the example for test.coefficient.
# }