
svydiags (version 0.3)

svyvif: Variance inflation factors (VIF) for linear models fitted with complex survey data

Description

Compute variance inflation factors (VIFs) for fixed-effects linear regression models fitted to data collected from one- and two-stage complex survey designs.

Usage

svyvif(X, w, V)

Arguments

X

\(n \times p\) matrix of real-valued covariates used in fitting a linear regression; \(n\) = number of observations, \(p\) = number of covariates in model, excluding the intercept. A column of 1's for an intercept should not be included. X should not contain columns for the strata and cluster identifiers (unless those variables are part of the model). No missing values are allowed.

w

\(n\)-vector of survey weights used in fitting the model. No missing values are allowed.

V

\(n \times n\) covariance matrix of the residuals as estimated, e.g., using Vmat. No missing values are allowed.
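The requirements on X, w, and V listed above can be checked before calling svyvif. The following is a minimal sketch in base R; `check_svyvif_inputs` is a hypothetical helper written for illustration and is not part of svydiags, and the toy inputs (including the identity matrix used for V) are made up.

```r
# Hypothetical helper (not part of svydiags): checks the documented
# requirements on the three arguments of svyvif.
check_svyvif_inputs <- function(X, w, V) {
  X <- as.matrix(X)
  n <- nrow(X)
  stopifnot(
    is.numeric(X), !anyNA(X),                        # real-valued, no missing values
    is.numeric(w), length(w) == n, !anyNA(w),        # n-vector of weights
    is.matrix(V), nrow(V) == n, ncol(V) == n, !anyNA(V)  # n x n covariance matrix
  )
  # A constant column suggests that an intercept was left in X
  is_const <- apply(X, 2, function(col) all(col == col[1]))
  if (any(is_const)) {
    stop("X appears to contain a constant (intercept) column")
  }
  invisible(TRUE)
}

# Toy inputs, for illustration only
set.seed(1)
n_toy <- 10
X_toy <- cbind(a = rnorm(n_toy), b = rnorm(n_toy))
w_toy <- runif(n_toy, min = 1, max = 5)
V_toy <- diag(n_toy)
ok <- check_svyvif_inputs(X_toy, w_toy, V_toy)
```

The constant-column test only catches an explicit column of identical values; it does not detect other forms of exact collinearity.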

Value

\(p \times 5\) matrix with columns:

svy.vif

complex sample VIF

reg.vif

standard VIF, \(1/(1 - R^2_k)\)

zeta

1st multiplicative adjustment to reg.vif

varrho

2nd multiplicative adjustment to reg.vif

zeta.x.varrho

product of the two adjustments to reg.vif

Details

svyvif computes a variance inflation factor (VIF) appropriate for a model fitted to complex survey data (see Liao & Valliant 2012). A VIF measures the inflation of the variance of a slope estimate caused by nonorthogonality of the predictors, over and above what the variance would be with orthogonality (Theil 1971; Belsley, Kuh, and Welsch 1980). The standard VIF equals \(1/(1 - R^2_k)\), where \(R_k\) is the multiple correlation coefficient from regressing the \(k^{th}\) column of X on the remaining columns. The complex sample value of the VIF is the standard VIF multiplied by two adjustments, denoted in the output as zeta and varrho; that is, svy.vif = reg.vif \(\times\) zeta \(\times\) varrho. There is no widely agreed-upon cutoff value for identifying high values of a VIF.
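The standard VIF formula \(1/(1 - R^2_k)\) can be illustrated with a short base R sketch. `standard_vif` is a hypothetical name used here for illustration; it computes the unweighted textbook VIF on simulated data and is not the svydiags implementation, so its values need not match the reg.vif column returned by svyvif (which may, for example, take the survey weights into account).

```r
# Hypothetical helper: unweighted textbook VIF for each column of X.
# Regress column k on the other columns (plus an intercept) and
# return 1 / (1 - R^2_k), computed as TSS / RSS.
standard_vif <- function(X) {
  X <- as.matrix(X)
  p <- ncol(X)
  vif <- numeric(p)
  for (k in seq_len(p)) {
    others <- cbind(rep(1, nrow(X)), X[, -k, drop = FALSE])
    fit <- lm.fit(x = others, y = X[, k])
    rss <- sum(fit$residuals^2)               # residual sum of squares
    tss <- sum((X[, k] - mean(X[, k]))^2)     # total sum of squares
    vif[k] <- tss / rss
  }
  names(vif) <- colnames(X)
  vif
}

# Simulated covariates: x2 is built from x1, x3 is independent
set.seed(1)
n <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.5)
x3 <- rnorm(n)
X <- cbind(x1 = x1, x2 = x2, x3 = x3)
vif <- standard_vif(X)
print(vif)
```

Because x2 is constructed from x1, those two columns should show larger VIFs than x3. In this unweighted setting the result agrees with the diagonal of the inverse correlation matrix, `diag(solve(cor(X)))`, which gives a quick cross-check.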

References

Belsley, D.A., Kuh, E. and Welsch, R.E. (1980). Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. New York: Wiley-Interscience.

Liao, D. and Valliant, R. (2012). Variance inflation factors in the analysis of complex survey data. Survey Methodology, 38, 53-62.

Theil, H. (1971). Principles of Econometrics. New York: John Wiley & Sons, Inc.

Lumley, T. (2010). Complex Surveys. New York: John Wiley & Sons.

Lumley, T. (2018). survey: analysis of complex survey samples. R package version 3.34.

See Also

Vmat

Examples

library(survey)
library(svydiags)   # provides nhanes2007, Vmat, and svyvif
data(nhanes2007)
    # sort by stratum and PSU identifiers
X1 <- nhanes2007[order(nhanes2007$SDMVSTRA, nhanes2007$SDMVPSU),]
    # eliminate cases with missing values
    # (logical indexing is safe even when no rows are incomplete)
X2 <- X1[complete.cases(X1),]
nhanes.dsgn <- svydesign(ids = ~SDMVPSU,
                         strata = ~SDMVSTRA,
                         weights = ~WTDRD1, nest=TRUE, data=X2)
m1 <- svyglm(BMXWT ~ RIDAGEYR + as.factor(RIDRETH1) + DR1TKCAL
            + DR1TTFAT + DR1TMFAT, design=nhanes.dsgn)
summary(m1)
    # estimate the covariance matrix of the residuals
V <- Vmat(mobj = m1,
          stvar = "SDMVSTRA",
          clvar = "SDMVPSU")
    # construct X matrix using model.matrix from stats package
X3 <- model.matrix(~ RIDAGEYR + as.factor(RIDRETH1) + DR1TKCAL + DR1TTFAT + DR1TMFAT,
        data = data.frame(X2))
    # remove col of 1's for intercept with X3[,-1]
svyvif(X = X3[,-1], w = X2$WTDRD1, V = V)
