HH (version 3.1-43)

lm.case: case statistics for regression analysis

Description

Case statistics for regression analysis. case.lm calculates the statistics. plot.case plots the cases, one statistic per panel, and illustrates and flags all observations for which the standard thresholds are exceeded. plot.case returns an object with class c("trellis.case", "trellis") containing the plot and the row.names of the flagged observations. The object is printed by a method which displays the set of graphs and prints the list of flagged cases. panel.case is a panel function for plot.case.

Usage

case(fit, ...)
# S3 method for lm
case(fit, lms = summary.lm(fit), lmi = lm.influence(fit), ...)

# S3 method for case
plot(x, fit,
     which=c("stu.res","si","h","cook","dffits",
             dimnames(x)[[2]][-(1:8)]),  ##DFBETAS
     between.in=list(y=4, x=9),
     cex.threshold=1.2,
     main.in=list(
       paste(deparse(fit$call), collapse=""),
       cex=main.cex),
     sigma.in=summary.lm(fit)$sigma,
     p.in=summary.lm(fit)$df[1]-1,
     main.cex=NULL,
     ...)

panel.case(x, y, subscripts, rownames, group.names, thresh, case.large, nn, pp, ss, cex.threshold, ...)

Arguments

fit

"lm" object computed with x=TRUE

lms

summary.lm(fit)

lmi

lm.influence(fit)

x

In plot.case, the matrix output from case.lm containing case diagnostics on each observation in the original dataset. In panel.case, the x variable to be plotted.

which

In plot.case, the names of the columns of x that are to be graphed.

between.in

between trellis/lattice argument.

cex.threshold

Multiplier for cex for the threshold values.

main.in

main title for xyplot. The default main title displays the linear model formula from fit.

sigma.in

standard error for the fit.

p.in

The number of degrees of freedom associated with the fitted model.

main.cex

cex for main title.

...

Other arguments to xyplot.

y

the y variable to be plotted.

thresh

Named list of lists. Each list contains the components threshold (y-locations where a reference line will be drawn), thresh.label (the right-axis labels for the reference lines), thresh.id (the bounds defining "Noteworthy Observations").

case.large

Named list of "Noteworthy Observations".

nn

Number of rows in original dataset.

pp

The number of degrees of freedom associated with the fitted model.

ss

Standard error for the fit.

subscripts

trellis/lattice argument, position in the reshaped dataset constructed by plot.case before calling xyplot.

rownames

row name in the original data.frame.

group.names

names of the individual statistics.
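
The default for which can be read as the five named statistics plus every column after the first eight, which the comment in Usage marks as the DFBETAS columns. The following is a minimal sketch of that expression, assuming the matrix columns follow the order listed under Value (e, h, si, sta.res, stu.res, dffit, dffits, cook, then one DFBETAS column per coefficient). The matrix x.demo and its coefficient names are hypothetical stand-ins, not output from case.lm.

```r
## Hypothetical stand-in for a case.lm result: 8 fixed columns followed by
## one DFBETAS column per coefficient (all names assumed for illustration).
stat.names <- c("e", "h", "si", "sta.res", "stu.res", "dffit", "dffits", "cook")
coef.names <- c("(Intercept)", "concent", "age")
x.demo <- matrix(0, nrow = 3, ncol = length(stat.names) + length(coef.names),
                 dimnames = list(NULL, c(stat.names, coef.names)))

## Same expression as the default for `which` in the Usage section:
## five named statistics, then all columns after the first eight.
which.default <- c("stu.res", "si", "h", "cook", "dffits",
                   dimnames(x.demo)[[2]][-(1:8)])
which.default
```

Passing a shorter character vector as which should restrict the plot to those columns, for example only "h" and "cook".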

Value

case.lm returns a matrix, with one row for each observation in the original dataset. The columns contain the diagnostic statistics: e (residuals), h* (hat diagonals), si* (deleted standard deviation), sta.res (standardized residuals), stu.res* (Studentized deleted residuals), dffit (difference in fits, change in predicted y when observation i is deleted), dffits* (standardized difference in fits, standardized change in predicted y when observation i is deleted), cook* (Cook's distance), and DFBETAs* (standardized difference in regression coefficients when observation i is deleted, one for each column of the x-matrix, including the intercept).
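
For orientation, the same kinds of statistics can be assembled from functions in base R's stats package. This is a sketch under stated assumptions, not the case.lm implementation: the stackloss data, the object name diag.demo, and the mapping of each column to a stats function are choices made here for illustration, and dffit is computed from the textbook identity h*e/(1-h).

```r
## Sketch using only base R (stats); the stackloss data ships with R.
fit  <- lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc., data = stackloss)
infl <- lm.influence(fit)

e <- residuals(fit)            # ordinary residuals
h <- hatvalues(fit)            # hat diagonals

diag.demo <- cbind(
  e       = e,
  h       = h,
  si      = infl$sigma,        # residual sd estimated with case i deleted
  sta.res = rstandard(fit),    # standardized residuals
  stu.res = rstudent(fit),     # Studentized deleted residuals
  dffit   = h * e / (1 - h),   # change in fitted value when case i is deleted
  dffits  = dffits(fit),       # standardized change in fitted value
  cook    = cooks.distance(fit),
  dfbetas(fit)                 # one column per coefficient, incl. intercept
)
round(head(diag.demo, 3), 3)
```

The result has one row per observation and eight statistic columns followed by one column per coefficient, which mirrors the layout described above.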

plot.case returns a c("trellis.case", "trellis") object containing the plot (including the starred columns by default) and also retains the row.names of the flagged observations in the $panel.args.common$case.large component. The print method for the c("trellis.case", "trellis") object prints the graph and the list of flagged observations.
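
The idea of flagging cases against thresholds can be illustrated with base R alone. The cutoffs below are common textbook conventions assumed for this sketch; this page does not state the thresholds plot.case uses, so they may differ. The name flagged.demo is hypothetical, and p here counts the intercept, whereas the default p.in in Usage is summary.lm(fit)$df[1]-1.

```r
## Sketch only: textbook cutoffs assumed for illustration,
## not taken from the plot.case source.
fit <- lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc., data = stackloss)
n <- nrow(stackloss)
p <- length(coef(fit))         # assumption: p counts the intercept

## For each statistic, keep the row names of cases beyond the cutoff.
flagged.demo <- list(
  stu.res = names(which(abs(rstudent(fit)) > 2)),
  h       = names(which(hatvalues(fit) > 2 * p / n)),
  dffits  = names(which(abs(dffits(fit)) > 2 * sqrt(p / n)))
)
flagged.demo
```

The result is a named list of row names, one element per statistic, similar in spirit to the case.large component described above.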

panel.case is a panel function for plot.case.

Details

lm.influence is part of S-Plus and R. case.lm and plot.case are based on Section 4.3.3, "Influence of Individual Observations", in Chambers and Hastie, Statistical Models in S.

References

Heiberger, Richard M. and Holland, Burt (2015). Statistical Analysis and Data Display: An Intermediate Course with Examples in R. Second Edition. Springer-Verlag, New York. https://www.springer.com/us/book/9781493921218

See Also

lm.influence.

Examples

# NOT RUN {
library(HH)  ## case() and its plot method are provided by HH
data(kidney)

kidney2.lm <- lm(clearance ~ concent + age + weight + concent*age,
                 data=kidney,
                 na.action=na.exclude)  ## recommended

kidney2.case <- case(kidney2.lm)

## this picture looks much better in portrait, specification is device dependent

plot(kidney2.case, kidney2.lm, par.strip.text=list(cex=.9),
     layout=c(2,3))
# }
