Estimate NB dispersion by modeling it as a parametric function of preliminarily estimated log mean relative frequencies.
estimate.dispersion(nb.data, x, model = "NBQ", method = "MAPL", ...)
nb.data: output from prepare.nb.data.
x: a design matrix specifying the mean structure of each row.
model: the name of the dispersion model, one of "NB2", "NBP", "NBQ" (default), "NBS" or "step".
method: a character string specifying the method for estimating the dispersion model, one of "ML" or "MAPL" (default).
...: (for future use).
A list with the following components:
dispersion estimates for each read count, a matrix of the same dimensions as the counts matrix in nb.data.
the likelihood of the fitted model.
details of the estimated dispersion model, NOT intended for use by end users. The name and contents of this component are subject to change in future versions.
We use a negative binomial (NB) distribution to model the read frequency of gene \(i\) in sample \(j\). The NB distribution uses a dispersion parameter \(\phi_{ij}\) to model the extra-Poisson variation between biological replicates. Under the NB model, the mean-variance relationship of a single read count satisfies \(\sigma_{ij}^2 = \mu_{ij} + \phi_{ij} \mu_{ij}^2\). Because RNA-Seq experiments typically have small sample sizes, estimating the NB dispersion \(\phi_{ij}\) for each gene \(i\) separately is not reliable. Instead, one can pool information across genes and biological samples by modeling \(\phi_{ij}\) as a function of the mean frequencies and library sizes.
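The mean-variance relationship above can be checked numerically. The following is a minimal sketch in base R, not part of NBPSeq: it assumes the usual correspondence between the dispersion and the `size` argument of `rnbinom` (size = 1/phi), and the values of `mu` and `phi` are arbitrary illustrations.

```r
# Illustrative values (assumptions, not taken from NBPSeq)
mu  <- 50    # mean read count
phi <- 0.1   # NB dispersion

# Theoretical variance under the NB model: sigma^2 = mu + phi * mu^2
nb.var <- mu + phi * mu^2

# Simulate NB counts; in rnbinom's (mu, size) parameterization, size = 1/phi
set.seed(1)
y <- rnbinom(1e5, mu = mu, size = 1 / phi)

# The empirical variance should be close to the theoretical one
print(c(theoretical = nb.var, empirical = var(y)))
```

With phi = 0 the variance reduces to the Poisson variance mu, which is why phi is said to capture extra-Poisson variation.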
Under the NB2 model, the dispersion is a constant across all genes and samples.
Under the NBP model, the log dispersion is modeled as a linear function of the preliminary estimates of the log mean relative frequencies (pi.pre):
log(phi) = par[1] + par[2] * log(pi.pre/pi.offset),
where pi.offset is 1e-4.
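The NBP formula can be evaluated directly. The sketch below is illustrative only: `nbp.dispersion` is a hypothetical helper, not an NBPSeq function, and the parameter values in `par` are made up for the example.

```r
# Hypothetical helper: NBP dispersion for given preliminary relative frequencies
nbp.dispersion <- function(pi.pre, par, pi.offset = 1e-4) {
  # log(phi) = par[1] + par[2] * log(pi.pre / pi.offset)
  exp(par[1] + par[2] * log(pi.pre / pi.offset))
}

# Illustrative parameters (assumptions): phi = 0.1 at pi.pre = pi.offset,
# decreasing as the relative frequency grows
par    <- c(log(0.1), -0.5)
pi.pre <- c(1e-6, 1e-5, 1e-4, 1e-3)

phi <- nbp.dispersion(pi.pre, par)
print(phi)
```

Setting par[2] to 0 makes the dispersion constant, which recovers the NB2 model as a special case of this formula.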
Under the NBQ model, the log dispersion is modeled as a quadratic function of the preliminary estimates of the log mean relative frequencies (pi.pre):
log(phi) = par[1] + par[2] * z + par[3] * z^2,
where z = log(pi.pre/pi.offset). By default, pi.offset is the median of pi.pre[subset,].
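As with NBP, the NBQ formula can be written out in a few lines. This is a sketch under stated assumptions: `nbq.dispersion` is a hypothetical helper, the parameters are invented, and pi.offset defaults to the median of the whole pi.pre vector rather than of a row subset of a matrix.

```r
# Hypothetical helper: NBQ dispersion for given preliminary relative frequencies
nbq.dispersion <- function(pi.pre, par, pi.offset = median(pi.pre)) {
  z <- log(pi.pre / pi.offset)
  # log(phi) = par[1] + par[2] * z + par[3] * z^2
  exp(par[1] + par[2] * z + par[3] * z^2)
}

# Illustrative parameters (assumptions, not fitted values)
par    <- c(log(0.1), -0.5, 0.05)
pi.pre <- c(1e-6, 1e-5, 1e-4, 1e-3, 1e-2)

phi <- nbq.dispersion(pi.pre, par)
print(phi)
```

Because z is 0 at the offset, exp(par[1]) is the dispersion at the median relative frequency, and par[3] = 0 reduces the model to the NBP form with the same offset.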
Under the NBS model, the dispersion is modeled as a smooth function (a natural cubic spline function) of the preliminary estimates of the log mean relative frequencies (pi.pre).
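To illustrate what a natural cubic spline in the log mean relative frequency looks like, the sketch below builds a basis with `ns` from the `splines` package that ships with R. The number of degrees of freedom, the knot placement (the `ns` default), the coefficients, the use of the log scale for the dispersion, and the simulated pi.pre values are all assumptions; this page does not state how NBPSeq sets up its spline.

```r
library(splines)  # ships with R

# Simulated preliminary relative frequencies (assumption, for illustration)
set.seed(1)
pi.pre <- exp(runif(200, log(1e-7), log(1e-3)))
z <- log(pi.pre / median(pi.pre))

# Natural cubic spline basis in z; df = 4 is an arbitrary choice
B <- ns(z, df = 4)

# Illustrative coefficients: an intercept plus one per basis column
par <- c(log(0.1), -1, -0.5, -1.5, -0.8)

# Log dispersion as a smooth function of z
log.phi <- par[1] + as.vector(B %*% par[-1])
phi <- exp(log.phi)
print(summary(phi))
```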
Under the "step" model, the dispersion is modeled as a step (piecewise constant) function.
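A piecewise constant dispersion can be sketched with `cut`. The example assumes that the steps are defined over the preliminary mean relative frequencies, as in the other models; the break points and dispersion levels are hypothetical and are not NBPSeq defaults.

```r
# Illustrative preliminary relative frequencies
pi.pre <- c(2e-7, 5e-6, 3e-5, 4e-4, 2e-3)

# Hypothetical break points defining three bins: (0, 1e-6], (1e-6, 1e-4], (1e-4, Inf]
breaks <- c(0, 1e-6, 1e-4, Inf)

# One log dispersion level per bin (illustrative values)
log.phi.levels <- log(c(1, 0.3, 0.05))

# Assign each frequency to its bin, then look up the constant level
bin <- cut(pi.pre, breaks = breaks, labels = FALSE)
phi <- exp(log.phi.levels[bin])
print(data.frame(pi.pre = pi.pre, bin = bin, phi = phi))
```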
# NOT RUN {
## See the example for test.coefficient.
# }