diff_in_coef_test: Difference-in-Coefficients Test for Survey Weights

Description

Implements the Hausman-Pfeffermann Difference-in-Coefficients test to assess whether survey weights significantly affect regression estimates.

Usage

diff_in_coef_test(
  model,
  lower.tail = FALSE,
  var_equal = TRUE,
  robust_type = c("HC0", "HC1", "HC2", "HC3"),
  coef_subset = NULL,
  na.action = stats::na.omit
)
# S3 method for diff_in_coef_test
print(x, ...)
# S3 method for diff_in_coef_test
summary(object, ...)
# S3 method for diff_in_coef_test
tidy(x, ...)
# S3 method for diff_in_coef_test
glance(x, ...)

Value

An object of class "diff_in_coef_test" containing:

statistic: Chi-squared test statistic
parameter: Degrees of freedom
p.value: P-value for the test
betas_unweighted: Unweighted coefficient estimates
betas_weighted: Weighted coefficient estimates
vcov_diff: Estimated variance-covariance matrix of coefficient differences
diff_betas: Vector of coefficient differences
call: Function call

Arguments

model: An object of class svyglm.
lower.tail: Logical; passed to pchisq().
var_equal: Logical; assume equal residual variance between models. If FALSE, a heteroskedasticity-robust variance estimator is used.
robust_type: Character; type of heteroskedasticity-robust variance estimator to use when var_equal = FALSE. Options are "HC0", "HC1", "HC2", and "HC3", as defined in the `sandwich` package.
coef_subset: Character vector of coefficient names to include in the test. Defaults to all coefficients.
na.action: Function to handle missing data before fitting the test.
x: An object of class diff_in_coef_test
...: Additional arguments passed to methods
object: An object of class diff_in_coef_test

Details

Let $X$ denote the design matrix and $y$ the response vector. Define the unweighted OLS estimator $$\hat\beta_{U} = (X^\top X)^{-1} X^\top y,$$ and the survey-weighted estimator $$\hat\beta_{W} = (X^\top W X)^{-1} X^\top W y,$$ where $W = \mathrm{diag}(w_1, \ldots, w_n)$ is the diagonal matrix of survey weights.

The test statistic is based on the difference $$d = \hat\beta_{W} - \hat\beta_{U}.$$

Under the null hypothesis that weights are not informative, $d$ has mean zero and variance $V_d$. The test statistic is $$T = d^\top V_d^{-1} d,$$ which is asymptotically $\chi^2_p$ distributed with $p$ equal to the number of coefficients tested.
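As a rough illustration of the statistic above, the following base R sketch computes $\hat\beta_U$, $\hat\beta_W$, $d$, and $T$ on simulated data. It is not the package's implementation: the data, the weights `w`, and all variable names are invented for illustration. The sketch estimates $V_d$ as $\hat\sigma^2 A A^\top$ with $A = (X^\top W X)^{-1} X^\top W - (X^\top X)^{-1} X^\top$, which assumes independent, homoskedastic errors and ignores any complex design features.

```r
# Simulated data (assumption: not from any real survey)
set.seed(1)
n <- 200
x <- rnorm(n)
w <- runif(n, 0.5, 3)        # hypothetical weights, unrelated to y by construction
y <- 1 + 2 * x + rnorm(n)

X <- cbind(Intercept = 1, x = x)

# Linear maps taking y to each estimator
H_u <- solve(crossprod(X), t(X))              # (X'X)^{-1} X'
H_w <- solve(crossprod(X, w * X), t(X * w))   # (X'WX)^{-1} X'W

beta_u <- drop(H_u %*% y)    # unweighted OLS estimate
beta_w <- drop(H_w %*% y)    # weighted estimate

# Difference in coefficients: d = A y
A <- H_w - H_u
d <- drop(A %*% y)

# Residual variance from the unweighted fit (homoskedasticity assumed)
res    <- drop(y - X %*% beta_u)
sigma2 <- sum(res^2) / (n - ncol(X))

# Var(d) = sigma^2 * A A' under the stated assumptions
V_d <- sigma2 * tcrossprod(A)

# Quadratic form T = d' V_d^{-1} d and its upper-tail chi-squared p-value
T_stat  <- sum(d * solve(V_d, d))
p_value <- pchisq(T_stat, df = length(d), lower.tail = FALSE)
```

Because the simulated weights are generated independently of the response, the null hypothesis holds by construction in this example.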

If var_equal = TRUE, $V_d$ is estimated assuming equal residual variance across weighted and unweighted models. If var_equal = FALSE, a heteroskedasticity-robust estimator (e.g. HC0–HC3) is used.
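To make the HC0–HC3 options concrete, here is a minimal base R sketch of one way a heteroskedasticity-robust $V_d$ could be formed: $V_d = A\,\mathrm{diag}(\omega)\,A^\top$, where $A$ maps $y$ to $d$ and $\omega_i$ follows the standard HC definitions. Using residuals and leverages from the unweighted fit is an assumption made here for illustration; the function's internal computation may differ. The data and the helper names `omega` and `robust_test` are hypothetical.

```r
# Simulated data with heteroskedastic errors (assumption: illustrative only)
set.seed(2)
n <- 200
x <- rnorm(n)
w <- runif(n, 0.5, 3)                          # hypothetical weights
y <- 1 + 2 * x + rnorm(n, sd = exp(0.3 * x))

X <- cbind(Intercept = 1, x = x)
k <- ncol(X)

H_u <- solve(crossprod(X), t(X))               # (X'X)^{-1} X'
H_w <- solve(crossprod(X, w * X), t(X * w))    # (X'WX)^{-1} X'W
A   <- H_w - H_u
d   <- drop(A %*% y)                           # beta_W - beta_U

e <- drop(y - X %*% (H_u %*% y))               # unweighted OLS residuals
h <- rowSums(X * t(H_u))                       # leverages: diag of X (X'X)^{-1} X'

# Standard HC weights applied to the squared residuals
omega <- function(type) {
  switch(type,
    HC0 = e^2,
    HC1 = e^2 * n / (n - k),
    HC2 = e^2 / (1 - h),
    HC3 = e^2 / (1 - h)^2
  )
}

# Quadratic-form statistic with V_d = A diag(omega) A'
robust_test <- function(type) {
  V_d  <- A %*% (omega(type) * t(A))
  stat <- sum(d * solve(V_d, d))
  c(statistic = stat,
    p.value   = pchisq(stat, df = length(d), lower.tail = FALSE))
}

results <- sapply(c("HC0", "HC1", "HC2", "HC3"), robust_test)
```

`results` is a 2 x 4 matrix with one column per estimator type, which makes it easy to see how sensitive the conclusion is to the small-sample correction.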

This test is a survey-weighted adaptation of the Hausman specification test (Hausman, 1978), as proposed by Pfeffermann (1993).

References

Hausman, J. A. (1978). Specification Tests in Econometrics. *Econometrica*, 46(6), 1251-1271. doi:10.2307/1913827

Pfeffermann, D. (1993). The Role of Sampling Weights When Modeling Survey Data. *International Statistical Review*, 61(2), 317-337. doi:10.2307/1403631

Examples

# Load the survey package (required) and the example data
library(survey)
data(api, package = "survey")

# Create a survey design and fit a weighted regression model
des <- svydesign(id = ~1, strata = ~stype, weights = ~pw, data = apistrat)
fit <- svyglm(api00 ~ ell + meals, design = des)

# Run the difference-in-coefficients test under each variance assumption
# and report the chi-squared statistic, degrees of freedom, and p-value
summary(diff_in_coef_test(fit, var_equal = TRUE))
summary(diff_in_coef_test(fit, var_equal = FALSE, robust_type = "HC3"))