stepVIF(model, threshold = 10, verbose = FALSE)
stepVIF starts by computing the VIF of all predictor variables in the linear model. Because some predictor variables, such as categorical variables, can have more than one degree of freedom, generalized variance-inflation factors (Fox and Monette, 1992) are calculated instead using vif. Generalized variance-inflation factors (GVIF) consist of VIF corrected for the number of degrees of freedom (df) of the predictor variable:
$GVIF = VIF^{1/(2\times df)}$
GVIF are interpretable as the inflation in size of the confidence ellipse or ellipsoid for the coefficients of the predictor variable in comparison with what would be obtained for orthogonal data (Fox and Weisberg, 2011).
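The quantities described above can be inspected directly. The following is a minimal sketch, assuming the car package is installed and using its Duncan data set; the column names are those returned by car::vif when a model contains a term with more than one degree of freedom.

```r
library(car)  # provides vif() and, through carData, the Duncan data set

fit <- lm(prestige ~ income + education + type, data = Duncan)

# Because `type` is a factor with more than one df, vif() returns a matrix
# with one row per term and the columns "GVIF", "Df" and "GVIF^(1/(2*Df))".
gvif <- vif(fit)
print(gvif)

# The df-corrected value can be recomputed from the first two columns.
corrected <- gvif[, "GVIF"]^(1 / (2 * gvif[, "Df"]))
print(all.equal(unname(corrected), unname(gvif[, "GVIF^(1/(2*Df))"])))
```

For a term with a single degree of freedom the corrected value is simply the square root of the ordinary VIF.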
The next step is to evaluate whether any predictor variable has a VIF larger than the specified threshold. Because stepVIF estimates GVIF and the threshold corresponds to a VIF value, the latter is transformed to the scale of GVIF by taking its square root. If only one predictor variable fails to meet the VIF threshold, it is automatically removed from the model and no further processing occurs. When two or more predictor variables fail to meet the VIF threshold, stepVIF fits a linear model between each of them and the dependent variable. The predictor variable with the lowest adjusted coefficient of determination is dropped from the model and new coefficients are calculated, resulting in a new linear model.
This process is repeated until all predictor variables included in the new model meet the VIF threshold.
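The procedure described above can be sketched roughly as follows. This is an illustrative reconstruction from the description, not the package's source: step_vif_sketch is a hypothetical helper, it assumes the car package is installed, that the df-corrected column of car::vif is the value compared with the square root of the threshold, and that every model term is a plain variable (no transformations or interactions). The data are simulated only to produce two nearly collinear predictors.

```r
library(car)

# Hypothetical helper written for illustration; not the stepVIF source.
step_vif_sketch <- function(model, threshold = 10) {
  limit <- sqrt(threshold)                    # VIF threshold on the GVIF scale
  response <- all.vars(formula(model))[1]     # name of the dependent variable
  repeat {
    term_labels <- attr(terms(model), "term.labels")
    if (length(term_labels) < 2) break        # vif() needs at least two terms
    v <- vif(model)
    # vif() returns a matrix when any term has more than one df,
    # otherwise a named vector of ordinary VIFs.
    g <- if (is.matrix(v)) v[, "GVIF^(1/(2*Df))"] else sqrt(v)
    bad <- names(g)[g > limit]
    if (length(bad) == 0) break               # every term meets the threshold
    if (length(bad) == 1) {
      drop <- bad                             # single offender: remove it
    } else {
      # Regress the response on each offending term alone and drop the one
      # with the lowest adjusted coefficient of determination.
      adj_r2 <- sapply(bad, function(term) {
        f <- reformulate(term, response = response)
        summary(lm(f, data = model.frame(model)))$adj.r.squared
      })
      drop <- names(which.min(adj_r2))
    }
    model <- update(model, as.formula(paste(". ~ . -", drop)))
  }
  model
}

# Simulated data: x2 is almost a copy of x1, x3 is unrelated to both.
set.seed(1)
d <- data.frame(x1 = rnorm(100), x3 = rnorm(100))
d$x2 <- d$x1 + rnorm(100, sd = 0.01)
d$y <- d$x1 + d$x3 + rnorm(100)

full <- lm(y ~ x1 + x2 + x3, data = d)
reduced <- step_vif_sketch(full, threshold = 10)
print(formula(reduced))
```

In this sketch the loop re-checks the model after every removal, so the single-offender case ends on the next pass without any special handling.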
Nothing is done if all predictor variables have a VIF value lower than the threshold, and stepVIF returns the original linear model.
Fox, J. (2008) Applied Regression Analysis and Generalized Linear Models, Second Edition. Sage.
Fox, J. and Weisberg, S. (2011) An R Companion to Applied Regression, Second Edition. Thousand Oaks: Sage.
Hair, J. F., Black, B., Babin, B. and Anderson, R. E. (2010) Multivariate data analysis. New Jersey: Pearson Prentice Hall.
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
vif, stepAIC.
require(car)
fit <- lm(prestige ~ income + education + type, data = Duncan)
fit <- stepVIF(fit, threshold = 10, verbose = TRUE)