lm.case calculates the case diagnostic statistics for a linear model fit. plot.case plots the cases, one statistic per panel, and illustrates and itemizes all observations for which the standard thresholds are exceeded. plot.case returns a "trellis" object containing the plot and also places the row.names of the flagged observations in the variable .lm.case.large. panel.case is a panel function for plot.case.

lm.case(fit, lms = summary.lm(fit), lmi = lm.influence(fit))
plot.case(x, fit,
which=c("stu.res","si","h","cook","dffits",
dimnames(x)[[2]][-(1:8)]), ##DFBETAS
between.in=list(y=4, x=9),
oma=c(0,0,0,4), cex.threshold=if.R(s=2, r=1.2),
main.in=list(
paste(deparse(fit$call), collapse=""),
cex=main.cex),
sigma.in=summary.lm(fit)$sigma,
p.in=summary.lm(fit)$df[1]-1,
obs.large=".lm.case.large",
obs.large.env=if.R(r=globalenv(), s=0),
main.cex=NULL,
...)
panel.case(x, y, subscripts, rownames, group.names,
nn, pp, ss, cex.threshold,
panel.number, ## R only. S-Plus ignores this argument
par.settings, ## R only. S-Plus ignores this argument
obs.large, obs.large.env,
...)

fit: "lm" object.
lms: summary.lm(fit)
lmi: lm.influence(fit)
x: In plot.case, the matrix output from lm.case containing case diagnostics on each observation in the original dataset. In panel.case, the x variable to be plotted.
which: In plot.case, the names of the columns of x that are to be graphed.
between.in: between trellis/lattice argument.
oma: Changes par()$oma to make room for the threshold values. A warning is printed when par()$oma is changed, because the delayed printing of trellis objects means it cannot be returned to its original value automatically.
cex.threshold: cex for the threshold values.
main.in: main title for xyplot. The default main title displays the linear model formula from fit.
sigma.in: Residual standard error of fit. The default is summary.lm(fit)$sigma.
p.in: The default is summary.lm(fit)$df[1]-1.
obs.large: Name of the variable in which the row.names of the flagged observations are stored. The default is .lm.case.large.
obs.large.env: Environment (by default globalenv() in R) where obs.large will be stored.
main.cex: cex for main title.
...: Additional arguments to xyplot.
[…] fit.
[…] plot.case before calling xyplot.
[…] xyplot. Although this argument is not used in the panel function, it is needed as a formal argument in S-Plus to absorb it out of ... and thereby prevent it from being forwarded to […]

lm.case
returns a matrix, with one row for each observation in the original dataset. The columns contain the diagnostic statistics:
e (residuals),
h* (hat diagonals),
si* (deleted standard deviation),
sta.res (standardized residuals),
stu.res* (Studentized deleted residuals),
dffit (difference in fits, change in predicted y when observation i is deleted),
dffits* (standardized difference in fits, standardized change in predicted y when observation i is deleted),
cook* (Cook's distance),
and DFBETAs* (standardized difference in regression coefficients when observation i is deleted, one for each column of the x-matrix, including the intercept).
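
The columns listed above correspond closely to quantities that the base R stats package can compute directly. The following is a minimal sketch, not the lm.case implementation: it assumes the built-in stackloss data in place of the kidney data, assumes that sta.res matches rstandard() and stu.res matches rstudent(), and uses column names chosen only to mirror the list above.

```r
## Sketch only: assembles lm.case-style columns from base R (stats) functions.
## The stackloss data and all object names are illustrative assumptions.
fit  <- lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc., data = stackloss)
infl <- lm.influence(fit)

e <- residuals(fit)   # ordinary residuals
h <- hatvalues(fit)   # diagonal of the hat matrix

case.sketch <- cbind(
  e       = e,
  h       = h,
  si      = infl$sigma,            # residual sd with observation i deleted
  sta.res = rstandard(fit),        # internally Studentized residuals
  stu.res = rstudent(fit),         # Studentized deleted residuals
  dffit   = e * h / (1 - h),       # change in fitted y_i when case i is deleted
  dffits  = dffits(fit),           # standardized version of dffit
  cook    = cooks.distance(fit),
  dfbetas(fit)                     # one column per coefficient, intercept included
)

head(round(case.sketch, 3))
```

The dffit column uses the identity that deleting observation i changes its fitted value by h_i e_i / (1 - h_i), so no refitting is needed.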
plot.case returns a "trellis" object containing the plot (including the starred columns by default) and also places the row.names of the flagged observations in the variable .lm.case.large. The variable .lm.case.large is placed by default into frame 0 in S-Plus and into globalenv() in R.
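
As a rough illustration of how flagged row names can be collected and stored under a chosen name in a chosen environment, consider the sketch below. It is not the plot.case code. The thresholds are common textbook rules of thumb and are assumptions here, since this page does not state the ones plot.case applies. The variable name .lm.case.large.sketch, the list-by-statistic structure, and the use of the stackloss data are likewise hypothetical.

```r
## Sketch only: the thresholds below are common rules of thumb and are
## assumptions, not necessarily the ones plot.case applies.
fit <- lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc., data = stackloss)
n <- nrow(stackloss)
p <- length(coef(fit))   # number of coefficients, intercept included

## Row names of the observations that exceed each (assumed) threshold.
flags <- list(
  stu.res = names(which(abs(rstudent(fit)) > 2)),
  h       = names(which(hatvalues(fit) > 2 * p / n)),
  dffits  = names(which(abs(dffits(fit)) > 2 * sqrt(p / n))),
  cook    = names(which(cooks.distance(fit) > 4 / n))
)

## Store the result under a chosen name in a chosen environment, mirroring
## the roles of the obs.large and obs.large.env arguments.
obs.large     <- ".lm.case.large.sketch"  # hypothetical name
obs.large.env <- new.env()                # plot.case defaults to globalenv() in R
assign(obs.large, flags, envir = obs.large.env)

get(obs.large, envir = obs.large.env)
```

A fresh environment is used here so that the sketch leaves the global environment untouched; passing globalenv() instead would match the documented R default.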
panel.case is a panel function for plot.case. The variable .lm.case.large is created one column at a time inside the panel function.

lm.influence is part of S-Plus and R.

lm.case and plot.case are based on Section 4.3.3, "Influence of Individual Observations", in Chambers and Hastie, Statistical Models in S.

See also lm.influence in R and lm.influence in S-Plus.

kidney <- read.table(hh("datasets/kidney.dat"), header=TRUE)
kidney2.lm <- lm(clearance ~ concent + age + weight + concent*age,
data=kidney)
kidney2.case <- lm.case(kidney2.lm)
## this picture looks much better in portrait, specification is device dependent
## trellis.device(postscript, horizontal=TRUE) ## postscript
## trellis.device(orientation="portrait") ## S-Plus graphsheet
plot.case(kidney2.case, kidney2.lm, par.strip.text=list(cex=.9),
layout=c(2,3))
.lm.case.large ## file placed by default into frame 0 in S-Plus
## and into globalenv() in R