expectedDeviance: Expected Value of Scaled Unit Deviance for Linear Exponential Families

Description

Expected value and variance of the scaled unit deviance for common generalized linear model families.

Usage

expectedDeviance(mu, family="binomial", binom.size, nbinom.size, gamma.shape)

Value

A list with the components

mean: expected values
variance: variances

both of which have the same length and dimensions as the input mu.

Arguments

mu: numeric vector or matrix giving mean of response variable.
family: character string indicating the linear exponential family. Possible values are "binomial","gaussian", "Gamma", "inverse.gaussian", "poisson" or "negative.binomial".
binom.size: integer vector giving the number of binomial trials when family = "binomial". Equivalent to the "size" argument of pbinom.
nbinom.size: numeric vector giving the negative binomial size parameter when family = "negative.binomial", such that the variance of the response variable is mu + mu^2 / nbinom.size. Equivalent to the "size" parameter of pnbinom.
gamma.shape: numeric vector giving the gamma shape parameter when family = "Gamma", such that the variance of the response variable is mu^2 / gamma.shape. Equivalent to the "shape" parameter of pgamma.

Author

Lizhong Chen and Gordon Smyth

Details

For a generalized linear model (GLM), the scaled unit deviances can be computed using d <- f$dev.resids(y, mu, wt=1/phi) where f is the GLM family object, y is the response variable, mu is the vector of means and phi is the vector of GLM dispersions (incorporating any prior weights).

The scaled unit deviances are often treated as being chiquare distributed on 1 df, so the mean should be 1 and the variance should be 2. This distribution result only holds however when the saddlepoint approximation is accurate for the response variable distribution (Dunn and Smyth, 2018). In other cases, the expected value and variance of the unit deviances can be far from the nominal values. The expectedDeviance function returns the exact mean and variance of the unit deviance for the usual GLM familes assuming that mu is the true mean and phi is the true dispersion.

When family is "poisson", "binomial" or "negative.binomial", the expected values and variances are computed using Chebyshev polynomial approximations. When family = "Gamma", the function uses exact formulas derived by Smyth (1989).

Exact calculation of expected values and variances of negative binomial unit deviances have been used to construct adjusted residual deviances and adjusted residual degrees of freedom when fitting negative binomial generalized linear models in the edgeR software package (Baldoni et al 2024; Baldoni et al 2025; Chen et al, 2025).

References

Baldoni PL, Chen L, Smyth GK (2024). Faster and more accurate assessment of differential transcript expression with Gibbs sampling and edgeR v4. NAR Genomics and Bioinformatics 6(4), lqae151. tools:::Rd_expr_doi("10.1093/nargab/lqae151").

Baldoni PL, Chen L, Li M, Chen Y, Smyth GK (2025). Dividing out quantification uncertainty enables assessment of differential transcript usage with limma and edgeR. bioRxiv tools:::Rd_expr_doi("10.1101/2025.04.07.647659").

Chen Y, Chen L, Lun ATL, Baldoni PL, Smyth GK (2025). edgeR v4: powerful differential analysis of sequencing data with expanded functionality and improved support for small counts and larger datasets. Nucleic Acids Research 53(2), gkaf018. tools:::Rd_expr_doi("10.1093/nar/gkaf018")

Dunn PK, Smyth GK (2018). Generalized linear models with examples in R. Springer, New York, NY. tools:::Rd_expr_doi("10.1007/978-1-4419-0118-7")

Smyth, G. K. (1989). Generalized linear models with varying dispersion. J. R. Statist. Soc. B, 51, 47-61. tools:::Rd_expr_doi("10.1111/j.2517-6161.1989.tb01747.x")

Examples

Run this code

# Poisson example
lambda <- 3
nsim <- 1e4
y <- rpois(nsim, lambda=lambda)
d <- poisson()$dev.resids(y=y, mu=rep(lambda,nsim), wt=1)
c(mean=mean(d), variance=var(d))
unlist(expectedDeviance(mu=lambda, family="poisson"))

# binomial example
n <- 10
p <- 0.01
y <- rbinom(nsim, prob=p, size=n)
d <- binomial()$dev.resids(y=y/n, mu=rep(p,nsim), wt=n)
c(mean=mean(d), variance=var(d))
unlist(expectedDeviance(mu=p, family="binomial", binom.size=n))

# gamma example
alpha <- 5
beta <- 2
y <- beta * rgamma(1e4, shape=alpha)
d <- Gamma()$dev.resids(y=y, mu=rep(alpha*beta,n), wt=alpha)
c(mean=mean(d), variance=var(d))
unlist(expectedDeviance(mu=alpha*beta, family="Gamma", gamma.shape=alpha))

# negative binomial example
library(MASS)
mu <- 10
phi <- 0.2
y <- rnbinom(nsim, mu=mu, size=1/phi)
f <- MASS::negative.binomial(theta=1/phi)
d <- f$dev.resids(y=y, mu=rep(mu,nsim), wt=1)
c(mean=mean(d), variance=var(d))
unlist(expectedDeviance(mu=mu, family="negative.binomial", nbinom.size=1/phi))

# binomial expected deviance tends to zero for p small:
p <- seq(from=0.001,to=0.11,len=200)
ed <- expectedDeviance(mu=p,family="binomial",binom.size=10)
plot(p,ed$mean,type="l")

Run the code above in your browser using DataLab