lm.case calculates case diagnostic statistics for each observation in a linear model fit.
plot.case plots the cases, one statistic per panel, and
illustrates and itemizes all observations for which the standard
thresholds are exceeded. plot.case returns a "trellis"
object containing the plot and also places the row.names of the
flagged observations in the variable .lm.case.large.
panel.case is a panel function for plot.case.

Usage

lm.case(fit, lms = summary.lm(fit), lmi = lm.influence(fit))

plot.case(x, fit,
          which=c("stu.res","si","h","cook","dffits",
                  dimnames(x)[[2]][-(1:8)]),  ##DFBETAS
          between.in=list(y=4, x=9),
          oma=c(0,0,0,4), cex.threshold=if.R(s=2, r=1.2),
          main.in=list(
            paste(deparse(fit$call), collapse=""),
            cex=main.cex),
          sigma.in=summary.lm(fit)$sigma,
          p.in=summary.lm(fit)$df[1]-1,
          obs.large=".lm.case.large",
          obs.large.env=if.R(r=globalenv(), s=0),
          main.cex=NULL,
          ...)

panel.case(x, y, subscripts, rownames, group.names,
           nn, pp, ss, cex.threshold,
           panel.number,  ## R only. S-Plus ignores this argument
           par.settings,  ## R only. S-Plus ignores this argument
           obs.large, obs.large.env,
           ...)

Arguments

fit: "lm" object.
lms: summary.lm(fit).
lmi: lm.influence(fit).
x: In plot.case, the matrix output from lm.case
   containing case diagnostics on each observation in the original
   dataset.
   In panel.case, the x variable to be plotted.
which: In plot.case, the names of the columns of x
   that are to be graphed.
between.in: between trellis/lattice argument.
oma: Value for par()$oma, set to make room for the
   threshold values. A warning is printed when par()$oma
   is changed, as the delayed printing of trellis objects implies we can't
   return it to the original value automatically.
cex.threshold: cex for the threshold values.
main.in: main title for xyplot. The default main title
   displays the linear model formula from fit.
sigma.in: Standard error of fit. The default is summary.lm(fit)$sigma.
p.in: The default is summary.lm(fit)$df[1]-1.
obs.large: Name of the variable in which the row.names of the flagged
   observations are stored. The default name is .lm.case.large.
obs.large.env: Environment in R (by default globalenv()) where obs.large
   will be stored.
main.cex: cex for main title.
...: Other arguments to xyplot.
y, subscripts, rownames, group.names, nn, pp, ss,
panel.number, par.settings:
   Arguments to panel.case. [...] fit. [...] plot.case before calling
   xyplot. [...] xyplot.
   Although this argument is not used in the panel function,
   it is needed as a formal argument in S-Plus to absorb it out of ...
   and thereby prevent it from being forwarded to [...]

Value

lm.case returns a matrix, with one row for each observation
in the original dataset. The columns contain the diagnostic statistics:
e (residuals),
h* (hat diagonals),
si* (deleted standard deviation),
sta.res (standardized residuals),
stu.res* (Studentized deleted residuals),
dffit (difference in fits, change in predicted y when
observation i is deleted),
dffits* (standardized difference in fits, standardized change
in predicted y when observation i is deleted),
cook* (Cook's distance),
and DFBETAs* (standardized difference in regression coefficients when
observation i is deleted, one for each column of the x-matrix,
including the intercept).
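
The columns listed above have close counterparts in base R. The following is a minimal sketch, not the lm.case implementation: it assumes the built-in stackloss data (used only for illustration) and assumes that the base R influence functions correspond to the lm.case columns, which has not been checked against the HH source. The name diag.stats is hypothetical.

```r
# Sketch: base-R counterparts of the case statistics (assumed mapping).
# stackloss is a built-in dataset, used here only for illustration.
fit <- lm(stack.loss ~ ., data = stackloss)
lmi <- lm.influence(fit)

e  <- residuals(fit)   # e: ordinary residuals
h  <- lmi$hat          # h: hat diagonals
si <- lmi$sigma        # si: residual sd with observation i deleted

diag.stats <- cbind(
  e       = e,
  h       = h,
  si      = si,
  sta.res = rstandard(fit),       # standardized residuals
  stu.res = rstudent(fit),        # Studentized deleted residuals
  dffit   = h * e / (1 - h),      # change in predicted y_i when case i is deleted
  dffits  = dffits(fit),          # dffit / (si * sqrt(h))
  cook    = cooks.distance(fit),  # Cook's distance
  dfbetas(fit)                    # one DFBETAS column per coefficient
)
round(head(diag.stats, 3), 3)
```

In this sketch the first eight columns are the named statistics and the remaining columns are the DFBETAS, which matches the way the default which argument of plot.case selects dimnames(x)[[2]][-(1:8)].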
plot.case returns a "trellis" object containing the plot
(including the starred columns by default) and also places the
row.names of the flagged observations in the variable
.lm.case.large. The variable .lm.case.large is placed
by default into frame 0 in S-Plus and into globalenv() in R.
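
The mechanism of storing flagged row names under a configurable name can be sketched with assign() and get(). This is an illustration only: the cutoffs below are common textbook rules of thumb, assumed here, and may differ from the thresholds that panel.case applies. Note also that p in this sketch counts the intercept, whereas the default p.in is summary.lm(fit)$df[1]-1. The name ".lm.case.large.demo" and the use of a fresh environment are hypothetical choices made to avoid writing into globalenv().

```r
# Sketch only: flag cases with rule-of-thumb cutoffs (assumed, not taken
# from panel.case) and store their row names in a chosen environment.
fit <- lm(stack.loss ~ ., data = stackloss)
n <- nrow(stackloss)
p <- length(coef(fit))   # number of coefficients, intercept included

flagged <- list(
  stu.res = names(which(abs(rstudent(fit)) > 2)),
  h       = names(which(hatvalues(fit) > 2 * p / n)),
  dffits  = names(which(abs(dffits(fit)) > 2 * sqrt(p / n)))
)

obs.large     <- ".lm.case.large.demo"  # hypothetical name for this sketch
obs.large.env <- new.env()              # plot.case defaults to globalenv() in R
assign(obs.large, flagged, envir = obs.large.env)

# Retrieve the stored row names by name, as a caller would afterwards.
get(obs.large, envir = obs.large.env)
```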
panel.case is a panel function for plot.case. The
variable .lm.case.large is created one column at a time inside
the panel function.

Note

lm.influence is part of S-Plus and R.
lm.case and plot.case are based on:
Section 4.3.3 "Influence of Individual Observations"
in Chambers and Hastie, Statistical Models in S.

See Also

lm.influence in R,
lm.influence in S-Plus.

Examples

kidney <- read.table(hh("datasets/kidney.dat"), header=TRUE)
kidney2.lm <- lm(clearance ~ concent + age + weight + concent*age,
                 data=kidney)
kidney2.case <- lm.case(kidney2.lm)
## this picture looks much better in portrait, specification is device dependent
## trellis.device(postscript, horizontal=TRUE)  ## postscript
## trellis.device(orientation="portrait")       ## S-Plus graphsheet
plot.case(kidney2.case, kidney2.lm, par.strip.text=list(cex=.9),
          layout=c(2,3))
.lm.case.large  ## variable placed by default into frame 0 in S-Plus
                ## and into globalenv() in R
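
The one-statistic-per-panel layout can also be sketched with lattice directly. This is a rough illustration under stated assumptions, not a reimplementation of plot.case: it uses the built-in stackloss data rather than the kidney data, draws only four statistics computed with base R functions, and omits the threshold lines and the flagging of observations. The names stats and long are hypothetical.

```r
library(lattice)

fit <- lm(stack.loss ~ ., data = stackloss)

# A few case statistics from base R, one column per statistic.
stats <- cbind(
  stu.res = rstudent(fit),
  h       = hatvalues(fit),
  cook    = cooks.distance(fit),
  dffits  = dffits(fit)
)

# Long format: one row per (case, statistic) pair.
long <- data.frame(
  case      = rep(seq_len(nrow(stats)), times = ncol(stats)),
  statistic = factor(rep(colnames(stats), each = nrow(stats)),
                     levels = colnames(stats)),
  value     = as.vector(stats)
)

# One panel per statistic, each panel with its own y scale.
p <- xyplot(value ~ case | statistic, data = long,
            type = "h",
            scales = list(y = list(relation = "free")),
            layout = c(2, 2),
            par.strip.text = list(cex = 0.9))
print(p)
```

As with plot.case, the result is a "trellis" object, so layout and par.strip.text are handled by xyplot and the plot is drawn only when the object is printed.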