Deletion Diagnostics for Linear and Generalized Linear Models

Description

These functions calculate a variety of leave-one-out deletion diagnostics for linear and generalized linear models, including studentized residuals (for outlier detection), hatvalues (for detecting high-leverage observations), and Cook's distances, dfbeta, and dfbetas (for detecting influential observations).

Usage

rstudent(model, ...)

rstudent.lm(model, infl=influence(model), names=infl$names)

rstudent.glm(model, infl=influence(model), names=infl$names)

hatvalues(model, ...)

hatvalues.lm(model, infl=influence(model), names=infl$names)

cookd(model, ...)

cookd.lm(model, infl=influence(model), sumry=summary(model), names=infl$names)

cookd.glm(model, infl=influence(model), sumry=summary(model), names=infl$names)

dfbeta(model, ...)

dfbeta.lm(model, infl=influence(model), names=infl$names)

dfbetas(model, ...)

dfbetas.lm(model, infl=influence(model), sumry=summary(model), names=infl$names)

influence(model, ...)

influence.lm(model)

influence.glm(model)

Arguments

model

lm or glm model object.

infl

optionally, an influence-object precomputed for the model by influence.

sumry

optionally, a summary-object precomputed for the model by summary.

names

optionally, a vector of observation names.

...

arguments to be passed down from generic functions to method functions.

Value

rstudent, hatvalues, and cookd return vectors with one entry for each observation; dfbeta and dfbetas return matrices with rows for observations and columns for coefficients. influence returns a list with entries:
names: observation names.
hat: hat-values.
sigma: leave-one-out estimates of linear-model standard error or generalized-linear-model scale.
coefficients: dfbeta values.
wt.res: weighted residuals (for a linear model).
dev.res: deviance residuals (for a generalized linear model).
pear.res: Pearson residuals (for a generalized linear model).
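As a rough illustration of these return shapes, the sketch below uses the like-named functions in base R's stats package, with cooks.distance standing in for cookd. The mtcars data set, the model formula, and the substitution of cooks.distance are assumptions made for the example and are not part of this page; the list returned by the stats version of influence may also carry different entries from those listed above.

```r
# Illustrative model on a built-in data set (an assumption, not from this page).
fit <- lm(mpg ~ wt + hp, data = mtcars)
n <- nrow(mtcars)         # number of observations
p <- length(coef(fit))    # number of coefficients, including the intercept

rs  <- rstudent(fit)        # vector: one studentized residual per observation
hv  <- hatvalues(fit)       # vector: one hat-value per observation
cd  <- cooks.distance(fit)  # vector: stats counterpart of cookd
db  <- dfbeta(fit)          # matrix: n rows (observations) by p columns (coefficients)
dbs <- dfbetas(fit)         # matrix: same layout, standardized

infl <- influence(fit)      # list of the basic leave-one-out quantities
str(infl, max.level = 1)    # inspect which entries this version provides
```

Precomputing the influence object once and passing it as infl, as the Usage section allows, avoids repeating the same work when several diagnostics are needed for one model.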

Details

Basic quantities are computed by influence.lm or influence.glm, which are slightly modified versions of lm.influence from the base package. Values for generalized linear models are approximations, as described in Williams (1987), except that Cook's distances are scaled as F rather than as chi-square values.

Normally, the generic versions of these functions are the ones to be used directly. For hatvalues, dfbeta, and dfbetas, the method for linear models also works for generalized linear models.

The following diagnostics are provided: rstudent (studentized residuals), hatvalues (hat-values), cookd (Cook's distances), dfbeta, and dfbetas.
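For the linear-model case, the leave-one-out quantities can be obtained from a single fit without refitting. The sketch below is a minimal illustration using standard textbook formulas, not the package's own code; the mtcars data, the model formula, and all variable names are assumptions chosen for the example. It also refits the model once with a single observation removed, to show that the shortcut agrees with actual deletion.

```r
# Illustrative model on a built-in data set (an assumption, not from this page).
fit <- lm(mpg ~ wt + hp, data = mtcars)
X <- model.matrix(fit)    # design matrix, including the intercept column
e <- residuals(fit)       # ordinary residuals
n <- nrow(X)
p <- ncol(X)

# Hat-values: diagonal of the projection matrix X (X'X)^{-1} X'.
h <- diag(X %*% solve(crossprod(X)) %*% t(X))

# Residual variance from the full fit.
s2 <- sum(e^2) / (n - p)

# Leave-one-out residual variance: deleting case i lowers the residual
# sum of squares by e_i^2 / (1 - h_i) and uses one fewer degree of freedom.
s2_loo <- (sum(e^2) - e^2 / (1 - h)) / (n - p - 1)

# Studentized residuals, scaled by the leave-one-out standard error.
rstud <- e / sqrt(s2_loo * (1 - h))

# Cook's distances on the F scale (divided by p times the residual variance).
cook <- e^2 * h / (p * s2 * (1 - h)^2)

# Brute-force check for one case: refit without observation i.
i <- 1
fit_i <- lm(mpg ~ wt + hp, data = mtcars[-i, ])
sigma_refit <- summary(fit_i)$sigma   # should match sqrt(s2_loo[i])
```

For generalized linear models the corresponding values are approximations, as noted above, so a brute-force refit would not be expected to agree exactly.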

References

Belsley, D. A., Kuh, E. and Welsch, R. E. (1980) Regression Diagnostics. Wiley.

Cook, R. D. and Weisberg, S. (1984) Residuals and Influence in Regression. Wiley.

Fox, J. (1997) Applied Regression, Linear Models, and Related Methods. Sage.

Williams, D. A. (1987) Generalized linear model diagnostics using the deviance and single case deletions. Applied Statistics 36, 181-191.

Examples

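library(car)   # this package supplies the Duncan data and the functions used below
data(Duncan)
# pass the data frame to lm() instead of attaching it to the search path
mod <- lm(prestige ~ income + education, data=Duncan)
# 45 observations and 3 coefficients, so the t reference has 45 - 3 - 1 = 41 df
qq.plot(rstudent(mod), distribution="t", df=41)
plot(hatvalues(mod))
plot(cookd(mod))
plot(dfbeta(mod)[,2])    # column 2 is the income coefficient
plot(dfbetas(mod)[,2])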
