PoisDev: Compute Poisson deviances for a given design matrix

Description

A function called within QL.fit to compute Poisson deviances of each window for a given design matrix.

Usage

PoisDev(counts, design, log.offset, print.progress = TRUE,
  bias.fold.tolerance = 1.1, chip.col.indicator)

Arguments

counts

A matrix of integer counts obtained by counting reads across a genomic partition. Each row contains observations from a single window of the genomic partition. Each column contains observations from a single sample (either ChIP or control/input).

design

A design matrix for the full model, including a column that indicates whether the observation is a ChIP sample (1) or control/input (0). The number of rows must be ncol(counts). Means are modeled with a log link function.

log.offset

A vector of log-scale, additive factors used to adjust estimated log-scale means for differences in library sizes across samples. Commonly used offsets include log.offset=log(colSums(counts)) and log.offset=log(apply(counts[rowSums(counts)!=0,],2,quantile,.75)). If NULL, the later offset is used.

print.progress

logical. If TRUE, the function will provide an update on which window (row number) is being analyzed. Updates occur frequently to start then eventually occur every 5000 windows.

bias.fold.tolerance

A numerical value no smaller than 1. If the bias reduction of maximum likelihood estimates of (log) fold change is likely to result in a ratio of fold changes greater than this value, then bias reduction will be performed on such windows. Setting bias.fold.tolerance=Inf will completely disable bias reduction; setting bias.fold.tolerance=1 will always perform bias reduction (see details). If the constrained estimate differs from the unconstrained estimate, bias reduction is not performed.

chip.col.indicator

A binary vector of length ncol(design.matrix) that indicates which column of the full design matrix corresponds to the ChIP indicator.

Value

A list containing:

dev

Vector containing the deviance for each window under a Poisson model fit to design matrix specified by design. This vector is passed along within the QL.fit function.

means

Matrix of fitted mean values for each window.

parms

Matrix of estimated coefficients for each window.

dev.constrained

Same as dev, subject to the constraint that the ChIP coefficient is non-negative. If fitting a reduced model with no ChIP coefficient, this will be NA.

means.constrained

Same as means, subject to the constraint that the ChIP coefficient is non-negative. If fitting a reduced model with no ChIP coefficient, this will be NA.

parms.constrained

Same as parms, subject to the constraint that the ChIP coefficient is non-negative. If fitting a reduced model with no ChIP coefficient, this will be NA.

Details

This functions fits, for each row of counts, a Poisson log-linear model.

Asymptotic biases of regression coefficients (i.e., log fold changes) are then estimated by a plug-in estimate [eqn. (15.4) of McCullagh and Nelder, 1989] from the last iteration of iteratively reweighted least squares (IWLS) procedure. The fitted response values are then compared with or without such a bias term. If the ratio of fitted response values are larger than bias.fold.tolerance for any observation and the unconstrained estimate equals the constrained estimate, then the bias-reduction (not bias-correction) procedure according to Firth (1993) and Kosmidis & Firth (2009) is applied to such rows of counts, by adjusting the score equation with a term based on the observed information. Such bias-reduced estimates are more advantageous than directly subtracting the estimated bias from the maximum likelihood estimates as the latter may not exist (e.g., when all counts in one treatment group are zeros).

When the ChIP coefficient is constrained to be non-negative, quadratic programming is applied during IWLS using solve.QP. Note that if the constrained estimate of the regression coefficient differs from the unconstrained estimate for a given window, bias reduction is not performed for that window.

References

Firth (1993) "Bias reduction of maximum likelihood estimates" Biometrika, 80, 27--38.

Kosmidis and Firth (2009) "Bias reduction in exponential family nonlinear models" Biometrika, 96, 793--804.

Lund, Nettleton, McCarthy and Smyth (2012) "Detecting differential expression in RNA-sequence data using quasi-likelihood with shrunken dispersion estimates" emphSAGMB, 11(5).

McCullagh and Nelder (1989) "Generalized Linear Models", 2nd edition. London: Chapman and Hall.