```
Residuals(object, type = c("approx.deletion", "exact.deletion",
"standard.deviance", "standard.pearson", "deviance",
"pearson", "working", "response", "partial"))
```

object

: An object of class `glm` with a binomial family.

type

: The type of residuals to be returned. Default is `approx.deletion`.

Value

: A vector of residuals.

`Residuals` is intended to enhance the transparency of residuals of
binomial regression models in R and to unify the terminology. With
the exception of `exact.deletion`, all residuals are extracted with a
call to `rstudent`, `rstandard` or `residuals` from the `stats`
package (see the description of the individual residuals below).
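As an illustration, the different residual types can be requested directly from a fitted binomial `glm`. This is a minimal sketch with made-up data; it assumes the package providing `Residuals` is attached:

```
## Toy grouped binomial data (hypothetical values)
dat <- data.frame(y = c(2, 5, 9), n = c(10, 10, 10), x = 1:3)
fit <- glm(cbind(y, n - y) ~ x, family = binomial, data = dat)

## Default: approximate deletion residuals, extracted via rstudent
Residuals(fit)

## Other types map onto the stats extractors, e.g.
all.equal(unname(Residuals(fit, type = "pearson")),
          unname(residuals(fit, type = "pearson")))
```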
`response`

: response residuals
$$y_i - \hat{y}_i$$
The response residuals are also called raw residuals. The residuals
are extracted with a call to `residuals`.
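As a check, the response residuals can be reproduced from the observed and fitted proportions. A sketch with made-up data; for grouped binomial data, `fitted` returns $\hat{p}_i$ and the observed response is $y_i/n_i$:

```
dat <- data.frame(y = c(2, 5, 9), n = c(10, 10, 10), x = 1:3)
fit <- glm(cbind(y, n - y) ~ x, family = binomial, data = dat)

## raw residuals: observed proportion minus fitted probability
r_resp <- dat$y / dat$n - fitted(fit)
all.equal(unname(r_resp), unname(residuals(fit, type = "response")))
```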

`pearson`

: Pearson residuals
$$X_i =
\frac{y_i - n_i \hat{p}_i}{\sqrt{n_i\hat{p}_i(1-\hat{p}_i)}}$$
The residuals are extracted with a call to `residuals`.

`standard.pearson`

: standardized Pearson residuals
$$r_{P,i} = \frac{X_i}{\sqrt{1-h_i}} =
\frac{y_i - n_i\hat{p}_i}{\sqrt{n_i\hat{p}_i(1-\hat{p}_i)(1-h_i)}}$$
where $X_i$ are the Pearson residuals and $h_i$ are the
hatvalues obtainable with `hatvalues`

.
The standardized Pearson residuals go by many names, including
studentized Pearson residuals, standardized residuals, studentized
residuals and internally studentized residuals.
The residuals are extracted with a call to `rstandard`.

`deviance`

: deviance residuals
The deviance residuals are the signed square roots of the
contributions of the individual observations to the overall deviance
$$d_i = sgn(y_i-\hat{y}_i)
\sqrt{2 y_i \log\left( \frac{y_i}{\hat{y}_i}\right) + 2(n_i-y_i)
\log\left( \frac{n_i-y_i}{n_i-\hat{y}_i}\right)}$$
The residuals are extracted with a call to `residuals`.

`standard.deviance`

: standardized deviance residuals
$$r_{D,i} = \frac{d_i}{\sqrt{1-h_i}}$$
where $d_i$ are the deviance residuals and $h_i$ are the
hatvalues that can be obtained with `hatvalues`

.
The standardized deviance residuals are also called studentized
deviance residuals.
The residuals are extracted with a call to `rstandard`.

`approx.deletion`

: approximate deletion residuals
$$sgn(y_i-\hat{y}_i)\sqrt{h_i r^2_{P,i}+(1-h_i)r^2_{D,i}}$$
where $r_{P,i}$ are the standardized Pearson residuals,
$r_{D,i}$ are the standardized deviance residuals and $h_i$
are the hatvalues that are obtained with `hatvalues`.

The approximate deletion residuals are approximations to the exact
deletion residuals (see below) as suggested by Williams (1987).
The approximate deletion residuals go by many different names in the
literature, including likelihood residuals, studentized residuals,
externally studentized residuals, deleted studentized residuals and
jack-knife residuals.
The residuals are extracted with a call to `rstudent`.

`exact.deletion`

: exact deletion residuals
The $i$th deletion residual is calculated as the difference between
the deviances obtained when fitting a linear logistic model to the
full set of $n$ observations and when fitting the same model to the
set of $n-1$ observations that excludes the $i$th observation, for
$i = 1,\ldots,n$. This gives rise to $n+1$ fitting processes and may
be computationally heavy for large data sets.

`working`

: working residuals
The difference between the working response and the linear predictor
at convergence
$$r_{W,i} = (y_i -
\hat{y}_i)\frac{\partial\hat{\eta}_i}{\partial\hat{\mu}_i}$$
The residuals are extracted with a call to `residuals`.

`partial`

: partial residuals
$$r_{W,i} + x_{ij} \hat{\beta}_j$$
where $j = 1,...,p$ and $p$ is the number of
predictors. $x_{ij}$ is the $i$th observation of the $j$th
predictor and $\hat{\beta}_j$ is the $j$th fitted coefficient.
The residuals are useful for making partial residual plots. They
are extracted with a call to `residuals`.

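Partial residuals are typically plotted against the corresponding predictor. A sketch with made-up data; for a `glm`, `residuals(fit, type = "partial")` returns a matrix with one column per model term:

```
dat <- data.frame(y = c(2, 5, 9, 4), n = rep(10, 4),
                  x = 1:4, z = c(0, 1, 0, 1))
fit <- glm(cbind(y, n - y) ~ x + z, family = binomial, data = dat)

pres <- residuals(fit, type = "partial")  # matrix, columns "x" and "z"
plot(dat$x, pres[, "x"],
     xlab = "x", ylab = "partial residual for x")
```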
```
data(serum)
serum.glm <- glm(cbind(y, n-y) ~ log(dose), family = binomial,
                 data = serum)
Residuals(serum.glm, type = 'standard.deviance')
```
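The standardized deviance residuals in this example can be verified against the defining formula $d_i/\sqrt{1-h_i}$ using only the `stats` extractors (a sketch; it assumes the `serum` data set from the same package is available):

```
data(serum)
serum.glm <- glm(cbind(y, n - y) ~ log(dose), family = binomial,
                 data = serum)

d <- residuals(serum.glm, type = "deviance")  # deviance residuals d_i
h <- hatvalues(serum.glm)                     # hat values h_i
all.equal(unname(d / sqrt(1 - h)), unname(rstandard(serum.glm)))
```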