estimate.disp: Estimate Negative Binomial Dispersion Parameters

Description

Fit an NBP (or NB2) model to RNA-Seq count data and estimate the negative binomial dispersion parameter(s).

Usage

estimate.disp(obj, method = "NBP", print.level = 1, ...)

Arguments

obj

output from prepare.nbp.

method

"NBP" (default) or "NB2", the model for the count variance.

print.level

controls the number of messages printed: 0 suppresses all messages, 1 prints basic progress messages, and larger values print more detailed messages.

...

additional parameters controlling the estimation of the parameters.

Value

The list obj from the input with the following added components:

phi, alpha

parameters of the dispersion model.

pie

a matrix of the same dimensions as obj$counts, estimated mean relative frequencies.

Details

For each individual gene $i$, a negative binomial (NB) distribution uses a dispersion parameter $\phi_i$ to model the extra-Poisson variation between biological replicates: the NB model imposes a mean-variance relationship $\sigma_i^2 = \mu_i + \phi_i \mu_i^2$. Across all genes, the NBP parameterization of the NB distribution (the NBP model) uses two parameters $(\phi, \alpha)$ to model extra-Poisson variation over the entire range of expression levels. The NBP model allows the NB dispersion parameter to be an arbitrary power function of the mean ($\phi_i = \phi\mu_i^{\alpha-2}$), so that the variance is $\sigma_i^2 = \mu_i + \phi\mu_i^{\alpha}$.

The NBP model includes the Poisson model as a limiting case (as $\phi$ tends to $0$) and the NB2 model as a special case (when $\alpha=2$). Under the NB2 model, the dispersion parameter is a constant and does not vary with the mean expression level. The NBP model is more flexible and is the recommended default option.

The dispersion parameters are estimated from the pseudo counts (counts adjusted to have the same effective library sizes). The parameters are estimated by maximizing the log conditional likelihood of $(\phi, \alpha)$ given the row sums. The log conditional likelihood is computed for each gene in each treatment group and then summed over genes and treatment groups.
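The relationship between the mean, the dispersion, and the variance can be illustrated with a few lines of base R. This is only a sketch: it assumes the NBP variance function $\sigma_i^2 = \mu_i + \phi\mu_i^{\alpha}$ of Di et al. (2011), which corresponds to $\phi_i = \phi\mu_i^{\alpha-2}$. The helper names `nbp_dispersion` and `nb_variance` and all numeric values are hypothetical and are not part of the package.

```r
# Dispersion implied by the NBP model for a gene with mean `mu`.
# With alpha = 2 this reduces to the NB2 model (constant dispersion phi).
nbp_dispersion <- function(mu, phi, alpha) {
  phi * mu^(alpha - 2)
}

# NB variance from the mean-variance relationship
# sigma^2 = mu + phi_i * mu^2, with phi_i given by the NBP model.
nb_variance <- function(mu, phi, alpha) {
  mu + nbp_dispersion(mu, phi, alpha) * mu^2
}

mu <- c(10, 100, 1000)  # hypothetical mean expression levels
phi <- 0.1              # hypothetical value, not an estimate
alpha_nbp <- 1.5        # hypothetical value, not an estimate

disp_nb2 <- nbp_dispersion(mu, phi, alpha = 2)
disp_nbp <- nbp_dispersion(mu, phi, alpha = alpha_nbp)
var_nbp <- nb_variance(mu, phi, alpha_nbp)

print(data.frame(mu, disp_nb2, disp_nbp, var_nbp))
```

With `alpha = 2` the dispersion is the same at every expression level, whereas with `alpha < 2` it decreases as the mean increases.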

References

Di Y, Schafer DW, Cumbie JS, and Chang JH (2011): "The NBP Negative Binomial Model for Assessing Differential Gene Expression from RNA-Seq", Statistical Applications in Genetics and Molecular Biology, 10 (1).

Examples


## Load the package and the Arabidopsis data
library(NBPSeq)
data(arab)

## Specify treatment groups
grp.ids = c(1, 1, 1, 2, 2, 2)

## Prepare an NBP object, adjusting the library sizes by thinning the counts.
## Thinning is random, so set the seed for reproducibility.
## For demonstration purposes, we use only the first 100 rows of the data.
set.seed(999)
obj = prepare.nbp(arab[1:100,], grp.ids, print.level=5)

## Estimate the NBP dispersion parameters
obj = estimate.disp(obj, print.level=5)

## Print the NBP object
print.nbp(obj)
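
As a possible follow-up, the sketch below fits both variance models to the same prepared object and inspects the components listed under Value. It assumes the NBPSeq package is installed and uses only the arguments shown in Usage. The names `fit.nbp` and `fit.nb2` are chosen here for illustration.

```r
library(NBPSeq)

data(arab)
grp.ids <- c(1, 1, 1, 2, 2, 2)

# Thinning is random, so fix the seed before preparing the object.
set.seed(999)
obj <- prepare.nbp(arab[1:100, ], grp.ids, print.level = 0)

# Fit the default NBP model and the NB2 model to the same object.
fit.nbp <- estimate.disp(obj, method = "NBP", print.level = 0)
fit.nb2 <- estimate.disp(obj, method = "NB2", print.level = 0)

# Estimated parameters of the dispersion model under each fit.
print(c(phi = fit.nbp$phi, alpha = fit.nbp$alpha))
print(c(phi = fit.nb2$phi, alpha = fit.nb2$alpha))

# pie is documented to have the same dimensions as the counts matrix.
print(dim(fit.nbp$pie))
print(dim(fit.nbp$counts))
```

Setting `print.level = 0` keeps the output limited to the quantities printed explicitly.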
