Half-Normal Plots with Simulation Envelopes

Produces a (half-)normal plot from a fitted model object for a range of different models. Extendable to non-implemented model classes.

GLM, hnp
hnp(object, sim = 99, conf = 0.95, resid.type, maxit,
    halfnormal = T, scale = F, plot.sim = T, verb.sim = F,
    warn = F, how.many.out = F, print.on = F, paint.out = F,
    col.paint.out, newclass = F, diagfun, simfun, fitfun, ...)

fitted model object or numeric vector.


number of simulations used to compute envelope. Default is 99.


confidence level of the simulated envelope. Default is 0.95.


type of residuals to be used; must be one of "deviance", "pearson", "response", "working", "simple", "student", or "standard". Not all model type and residual type combinations are allowed. Defaults are "student" for aov and lm objects, "deviance" for glm, glm.nb, lmer, glmer and aodml objects, "simple" for gamlss objects, "response" for glmmadmb and vglm objects, "pearson" for zeroinfl and hurdle objects.


maximum number of iterations of the estimation algorithm. Defaults are 25 for glm, glm.nb, gamlss and vglm objects, 300 for glmmadmb, lmer and glmer objects, 3000 for aodml objects, 10000 for zeroinfl and hurdle objects.


logical. If TRUE, a half-normal plot is produced. If FALSE, a normal plot is produced. Default is TRUE.


logical. If TRUE and if object is a numeric vector, simulates from a normal distribution with mean and variance estimated from object. If FALSE, uses a standard normal distribution to simulate from. Default is FALSE.


logical. Should the (half-)normal plot be plotted? Default is TRUE.


logical. If TRUE, prints each step of the simulation procedure. Default is FALSE.


logical. If TRUE, shows warning messages in the simulation process. Default is FALSE.


logical. If TRUE, the number of points out of the envelope is printed. Default is FALSE.


logical. If TRUE, the number of points out of the envelope is printed on the plot. Default is FALSE.


logical. If TRUE, points out of the simulation envelope are plotted in a different color. Default is FALSE.


If paint.out=TRUE, sets the color of points out of the envelope. Default is "red".


logical. If TRUE, use diagfun, simfun, and fitfun to extract diagnostics (typically residuals), generate simulated data using fitted model parameters, and fit the desired model. Default is FALSE.


user-defined function used to obtain the diagnostic measures from the fitted model object (only used when newclass=TRUE). Default is resid.


user-defined function used to simulate a random sample from the model estimated parameters (only used when newclass=TRUE).


user-defined function used to re-fit the model to simulated data (only used when newclass=TRUE).

extra graphical arguments passed to plot.hnp.


A relatively easy way to assess goodness-of-fit of a fitted model is to use (half-)normal plots of a model diagnostic, e.g., different types of residuals, Cook's distance, leverage. These plots are obtained by plotting the ordered absolute values of a model diagnostic versus the expected order statistic of a half-normal distribution, $$\Phi^{-1}(\frac{i+n-1/8}{2*n+1/2})$$ (for a half-normal plot) or the normal distribution, $$\Phi^{-1}(\frac{i+3/8}{n+1/4})$$ (for a normal plot).

Atkinson (1985) proposed the addition of a simulated envelope, which is such that under the correct model the plot for the observed data is likely to fall within the envelope. The objective is not to provide a region of acceptance, but some sort of guidance to what kind of shape to expect.

Obtaining the simulated envelope is simple and consists of (1) fitting a model; (2) extracting model diagnostics and calculating sorted absolute values; (3) simulating 99 (or more) response variables using the same model matrix, error distribution and fitted parameters; (4) fitting the same model to each simulated response variable and obtaining the same model diagnostics, again sorted absolute values; (5) computing the desired percentiles (e.g., 2.5 and 97.5) at each value of the expected order statistic to form the envelope.

This function handles different model classes and more will be implemented as time goes by. So far, the following models are included:

Continuous data:
Normal: functions lm, aov and glm with family=gaussian
function glm with family=Gamma
Inverse gaussian: function glm with family=inverse.gaussian
Proportion data:
function glm with family=binomial
Quasi-binomial: function glm with family=quasibinomial
package VGAM - function vglm, with family=betabinomial;
package aods3 - function aodml, with family="bb";
package gamlss - function gamlss, with family=BB;
package glmmADMB - function glmmadmb, with family="betabinomial"
Zero-inflated binomial: package VGAM - function vglm, with family=zibinomial;
package gamlss - function gamlss, with family=ZIBI;
package glmmADMB - function glmmadmb, with family="binomial"
and zeroInfl=TRUE
Zero-inflated beta-binomial:
package gamlss - function gamlss, with family=ZIBB;
package glmmADMB - function glmmadmb, with family="betabinomial"
and zeroInfl=TRUE
Multinomial: package nnet - function multinom

Count data: Poisson: function glm with family=poisson Quasi-Poisson: function glm with family=quasipoisson Negative binomial: package MASS - function glm.nb; package aods3 - function aodml, with family="nb" and phi.scale="inverse" Zero-inflated Poisson: package pscl - function zeroinfl, with dist="poisson" Zero-inflated negative binomial: package pscl - function zeroinfl, with dist="negbin" Hurdle Poisson: package pscl - function hurdle, with dist="poisson" Hurdle negative binomial: package pscl - function hurdle, with dist="negbin" Mixed models:

Linear mixed models: package lme4, function lmer Generalized linear mixed models: package lme4, function glmer with family=poisson or binomial

Users can also use a numeric vector as object and hnp will generate the (half-)normal plot with a simulated envelope using the standard normal distribution (scale=F) or \(N(\mu, \sigma^2)\) (scale=T).

Implementing a new model class is done by providing three functions to hnp: diagfun - to obtain model diagnostics, simfun - to simulate random variables and fitfun - to refit the model to simulated variables. The way these functions must be written is shown in the Examples section.


hnp returns an object of class "hnp", which is a list containing the following components:


quantiles of the (half-)normal distribution


lower envelope band


median envelope band


upper envelope band


diagnostic measures in absolute value and in order


vector indicating which points are out of the envelope


color of points which are outside of the envelope (used if paint.out=TRUE)


logical. Equals TRUE if how.many.out=TRUE in the hnp call


length of the diagnostic measure vector


number of points out of the envelope


logical. Equals TRUE if print.on=TRUE in the hnp call


logical. Equals TRUE if paint.out=TRUE in the hnp call


matrix with all diagnostics obtained in the simulations. Each column represents one simulation


See documentation on example data sets for simple analyses and goodness-of-fit checking using hnp.


Moral, R. A., Hinde, J. and Dem<U+00E9>trio, C. G. B. (2017) Half-normal plots and overdispersed models in R: the hnp package. Journal of Statistical Software 81(10):1-23.

Atkinson, A. C. (1985) Plots, transformations and regression, Clarendon Press, Oxford.

Dem<U+00E9>trio, C. G. B. and Hinde, J. (1997) Half-normal plots and overdispersion. GLIM Newsletter 27:19-26.

Hinde, J. and Dem<U+00E9>trio, C. G. B. (1998) Overdispersion: models and estimation. Computational Statistics and Data Analysis 27:151-170.

Dem<U+00E9>trio, C. G. B., Hinde, J. and Moral, R. A. (2014) Models for overdispersed data in entomology. In Godoy, W. A. C. and Ferreira, C. P. (Eds.) Ecological modelling applied to entomology. Springer.

See Also

plot.hnp, cbb, chryso, corn, fungi, oil, progeny, wolbachia

  • hnp
## Simple Poisson regression
counts <- c(rpois(5, 2), rpois(5, 4), rpois(5, 6), rpois(5, 8))
treatment <- gl(4, 5)
fit <- glm(counts ~ treatment, family=poisson)
anova(fit, test="Chisq")

## half-normal plot

## or save it in an object and then use the plot method
my.hnp <- hnp(fit, print.on=TRUE, plot=FALSE)

## changing graphical parameters
plot(my.hnp, lty=2, pch=4, cex=1.2)
plot(my.hnp, lty=c(2,3,2), pch=4, cex=1.2, col=c(2,2,2,1))
plot(my.hnp, main="Half-normal plot", xlab="Half-normal scores",
     ylab="Deviance residuals", legpos="bottomright")

## Using a numeric vector
my.vec <- rnorm(20, 4, 4)
hnp(my.vec) # using N(0,1)
hnp(my.vec, scale=TRUE) # using N(mu, sigma^2)

## Implementing new classes
## Users provide three functions - diagfun, simfun and fitfun,
## in the following way:
## diagfun <- function(obj) {
##   userfunction(obj, other_argumens)
##     # e.g., resid(obj, type="pearson")
##   }
## simfun <- function(n, obj) {
##   userfunction(n, other_arguments) # e.g., rpois(n, fitted(obj))
##   }
## fitfun <- function(y.) {
##  userfunction(y. ~ linear_predictor, other_arguments, data=data)
##    # e.g., glm(y. ~ block + factor1 * factor2, family=poisson,
##    #           data=mydata)
##  }
## when response is binary:
## fitfun <- function(y.) {
##  userfunction(cbind(y., m-y.) ~ linear_predictor,
##               other_arguments, data=data)
##    #e.g., glm(cbind(y., m-y.) ~ treatment - 1,
##    #          family=binomial, data=data)
##  }

# }
## Example no. 1: Using Cook's distance as a diagnostic measure
y <- rpois(30, lambda=rep(c(.5, 1.5, 5), each=10))
tr <- gl(3, 10)
fit1 <- glm(y ~ tr, family=poisson)

# diagfun <- function(obj) cooks.distance(obj)

# simfun <- function(n, obj) {
  lam <- fitted(obj)
  rpois(n, lambda=lam)

# fitfun <- data.frame(y, tr) <- function(y.) glm(y. ~ tr, family=poisson,

# hnp call
hnp(fit1, newclass=TRUE,,,

## Example no. 2: Implementing gamma model using package gamlss
# load package

# model fitting
y <- rGA(30, mu=rep(c(.5, 1.5, 5), each=10), sigma=.5)
tr <- gl(3, 10)
fit2 <- gamlss(y ~ tr, family=GA)

# diagfun <- function(obj) resid(obj) # this is the default if no
                                  # diagfun is provided

# simfun <- function(n, obj) {
  mu <- obj$mu.fv
  sig <- obj$sigma.fv
  rGA(n, mu=mu, sigma=sig)

# fitfun <- data.frame(y, tr) <- function(y.) gamlss(y. ~ tr, family=GA,

# hnp call
hnp(fit2, newclass=TRUE,,,, data=data.frame(y, tr))

## Example no. 3: Implementing binomial model in gamlss
# model fitting
y <- rBI(30, bd=50, mu=rep(c(.2, .5, .9), each=10))
m <- 50
tr <- gl(3, 10)
fit3 <- gamlss(cbind(y, m-y) ~ tr, family=BI)

# diagfun <- function(obj) resid(obj)

# simfun <- function(n, obj) {
  mu <- obj$mu.fv
  bd <- obj$bd
  rBI(n, bd=bd, mu=mu)

# fitfun <- data.frame(y, tr, m) <- function(y.) gamlss(cbind(y., m-y.) ~ tr,

# hnp call
hnp(fit3, newclass=TRUE,,,
# }
Documentation reproduced from package hnp, version 1.2-6, License: GPL (>= 2)

Community examples

Looks like there are no examples yet.