which.influence: Which Observations are Influential

Description

Creates a list with a component for each factor in the model. The names of the components are the factor names. Each component contains the observation identifiers of all observations that are "overly influential" with respect to that factor, meaning that $|dfbetas| > u$ for at least one $\beta_i$ associated with that factor, where $u$ is the cutoff. The default cutoff is 0.2. The fit must come from a function for which resid(fit, type="dfbetas") is defined.

show.influence, written by Jens Oehlschlaegel-Akiyoshi, applies the result of which.influence to a data frame, usually the one used to fit the model, to report the results.
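The influence criterion in the Description can be sketched with base R alone. The following is an illustration only, not the implementation used by which.influence: it assumes an ordinary lm fit on the built-in mtcars data, uses stats::dfbetas in place of resid(fit, type="dfbetas"), and the names term_of_column, columns_by_term and influential are hypothetical. It also reports the intercept as a component, which may not match what which.influence does.

```r
# Illustration only: base R stats functions, not the rms implementation.
fit <- lm(mpg ~ wt + factor(cyl), data = mtcars)
cutoff <- 0.2

db <- dfbetas(fit)  # one row per observation, one column per coefficient

# Map each design-matrix column to its model term (0 = intercept)
term_of_column <- attr(model.matrix(fit), "assign")
term_labels <- c("Intercept", attr(terms(fit), "term.labels"))

# For each term, flag observations with |dfbetas| > cutoff on any of its columns
columns_by_term <- split(seq_along(term_of_column), term_of_column)
influential <- lapply(columns_by_term, function(cols) {
  exceeds <- abs(db[, cols, drop = FALSE]) > cutoff
  rownames(db)[apply(exceeds, 1, any)]
})
names(influential) <- term_labels[as.integer(names(influential)) + 1]

influential
```

A multi-level factor such as factor(cyl) contributes several coefficients, so an observation is listed under that term as soon as any one of them exceeds the cutoff.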

Usage

which.influence(fit, cutoff=.2)
show.influence(object, dframe, report=NULL, sig=NULL, id=NULL)

Arguments

fit

a fit object from a fitting function for which resid(fit, type="dfbetas") is defined

object

the result of which.influence

dframe

data frame containing observations pertinent to the model fit

cutoff

cutoff for |dfbetas| above which an observation is flagged as influential (default 0.2)

report

other columns of the data frame to report besides those corresponding to predictors that are influential for some observations

sig

runs results through signif with sig digits if sig is given

id

a character vector that labels rows of dframe if row.names were not used

Value

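which.influence returns a list with one component per factor in the model, each holding the identifiers of the observations that are influential for that factor. show.influence returns a marked data frame whose first column is a count of influence values.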

concept

logistic regression model

Examples


# print observations in data frame that are influential,
# separately for each factor in the model
library(rms)  # provides lrm, rcs, which.influence and show.influence
x1 <- 1:20
x2 <- abs(x1-10)
x3 <- factor(rep(0:2,length.out=20))
y  <- c(rep(0:1,8),1,1,1,1)
# x=TRUE, y=TRUE store the design matrix and response needed for dfbetas
f  <- lrm(y ~ rcs(x1,3) + x2 + x3, x=TRUE, y=TRUE)
w   <- which.influence(f, .55)
nam <- names(w)
d   <- data.frame(x1,x2,x3,y)
for(i in seq_along(nam)) {
  print(paste("Influential observations for effect of", nam[i]), quote=FALSE)
  print(d[w[[i]],])
}

show.influence(w, d)  # better way to show results
