Calculate the Variance Inflation Factor
The VIF for predictor $i$ is $1/(1-R_i^2)$, where $R_i^2$ is the $R^2$ from a regression of predictor $i$ against the remaining predictors.
vif(x, ...)

## S3 method for class 'default':
vif(x, y.name, na.action = na.exclude, ...)  ## x is a data.frame

## S3 method for class 'formula':
vif(x, data, na.action = na.exclude, ...)  ## x is a formula

## S3 method for class 'lm':
vif(x, na.action = na.exclude, ...)  ## x is a "lm" object computed with x=TRUE
- x: data.frame, formula, or lm object computed with x=TRUE.
- ...: additional arguments.
- y.name: Name of Y-variable to be excluded from the computations.
- data: A data frame in which the variables specified in the formula will be found. If missing, the variables are searched for in the standard way.
A simple diagnostic of collinearity is the variance inflation factor (VIF), with one value for each regression coefficient other than the intercept. Since the condition of collinearity involves the predictors but not the response, this measure is a function of the $X$'s but not of $Y$. The VIF for predictor $i$ is $1/(1-R_i^2)$, where $R_i^2$ is the $R^2$ from a regression of predictor $i$ against the remaining predictors. If $R_i^2$ is close to 1, predictor $i$ is well explained by a linear function of the remaining predictors, and its presence in the model is therefore redundant. Values of VIF exceeding 5 are considered evidence of collinearity: the information carried by a predictor having such a VIF is contained in a subset of the remaining predictors. If, however, all of a model's regression coefficients differ significantly from 0 ($p$-value $
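As a quick arithmetic illustration of the formula (the $R_i^2$ values below are chosen only for illustration and do not come from any dataset), the VIF grows slowly at first and then sharply as $R_i^2$ approaches 1:

$$
R_i^2 = 0 \Rightarrow \mathrm{VIF}_i = 1, \qquad
R_i^2 = 0.5 \Rightarrow \mathrm{VIF}_i = 2, \qquad
R_i^2 = 0.8 \Rightarrow \mathrm{VIF}_i = 5, \qquad
R_i^2 = 0.9 \Rightarrow \mathrm{VIF}_i = 10.
$$

Solving $1/(1-R_i^2) = 5$ gives $R_i^2 = 0.8$, so a VIF threshold of 5 corresponds to the remaining predictors explaining 80% of the variance of predictor $i$.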
- Vector of VIF values, one for each X-variable.
Heiberger, Richard M. and Holland, Burt (2004b). Statistical Analysis and Data Display: An Intermediate Course with Examples in S-Plus, R, and SAS. Springer Texts in Statistics. Springer. ISBN 0-387-40270-5.
usair <- read.table(hh("datasets/usair.dat"),
                    col.names=c("SO2","temp","mfgfirms","popn",
                                "wind","precip","raindays"))
usair$lnSO2 <- log(usair$SO2)
usair$lnmfg <- log(usair$mfgfirms)
usair$lnpopn <- log(usair$popn)

usair.lm <- lm(lnSO2 ~ temp + lnmfg + wind + precip, data=usair, x=TRUE)
vif(usair.lm)  ## the lm object must be computed with x=TRUE

vif(lnSO2 ~ temp + lnmfg + wind + precip, data=usair)

vif(usair)
vif(usair, y.name="lnSO2")
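The definition $1/(1-R_i^2)$ can also be checked by hand in base R. The sketch below is only an illustration, not the package's implementation: it assumes the built-in mtcars data as a stand-in set of predictors, and manual_vif is a hypothetical name introduced here.

```r
## Hand computation of VIF using base R only.
## Assumptions: mtcars stands in for the predictors; manual_vif is a
## hypothetical name, not part of the vif() interface documented above.
X <- mtcars[, c("disp", "hp", "wt", "drat")]

manual_vif <- sapply(names(X), function(v) {
  others <- setdiff(names(X), v)
  ## regress predictor v on the remaining predictors
  fit <- lm(reformulate(others, response = v), data = X)
  1 / (1 - summary(fit)$r.squared)
})
manual_vif

## Cross-check: the VIFs equal the diagonal of the inverse of the
## correlation matrix of the predictors.
all.equal(unname(manual_vif), unname(diag(solve(cor(X)))))
```

Each element of manual_vif is the VIF of one column of X. The final line compares the result with a standard equivalent formula and should report agreement up to numerical tolerance.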