pred_power_test: Pfeffermann-Nathan Predictive Power Test (svyglm, K-fold CV, fold-mean option) (In production)

Description

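Implements the predictive power test as described by Wang et al. (2023, Sec. 2.2). Observations are split into an estimation set and a validation set, and unweighted and weighted linear regressions are fitted on the estimation set. For each validation observation, the squared-error difference \(D_i = (y_i - \hat y_{u,i})^2 - (y_i - \hat y_{w,i})^2\) is computed, where \(\hat y_{u,i}\) and \(\hat y_{w,i}\) are the unweighted and weighted predictions. The null hypothesis \(H_0: E[D_i] = 0\) is tested with \(Z = \bar D / (s_D / \sqrt{n_V})\), where \(\bar D\) and \(s_D\) are the sample mean and standard deviation of the test observations and \(n_V\) is their number. K-fold CV is the default. With the "fold-mean" option, the per-fold means of \(D_i\) serve as the test observations, which reduces dependence among the errors.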
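The single-split version of the procedure (kfold = FALSE) can be illustrated with a minimal base-R sketch. This is not the package's implementation: it assumes simulated data, uses lm() with a weights argument in place of svyglm() (the weighted point estimates coincide, which is all the predictions need), and assumes a standard normal reference distribution for the two-sided p-value. All object names are hypothetical.

```r
# Sketch only: lm() stands in for svyglm(); data and names are hypothetical.
set.seed(1)
n <- 400
dat <- data.frame(x = rnorm(n), wt = runif(n, 1, 5))  # wt: made-up sampling weights
dat$y <- 1 + 2 * dat$x + rnorm(n)

# Split into estimation and validation sets (mirrors est_split = 0.5).
est_idx <- sample(n, size = floor(0.5 * n))
est <- dat[est_idx, ]
val <- dat[-est_idx, ]

# Fit the unweighted and weighted regressions on the estimation set.
fit_u <- lm(y ~ x, data = est)
fit_w <- lm(y ~ x, data = est, weights = wt)

# Validation squared-error differences D_i.
D <- (val$y - predict(fit_u, newdata = val))^2 -
     (val$y - predict(fit_w, newdata = val))^2

# Z statistic and two-sided p-value (normal reference assumed).
n_V <- length(D)
Z <- mean(D) / (sd(D) / sqrt(n_V))
p_value <- 2 * pnorm(-abs(Z))
```

A positive mean of D indicates that the weighted fit predicts the validation responses better than the unweighted fit; a negative mean indicates the opposite.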

Usage

pred_power_test(
  model,
  kfold = TRUE,
  K = 5,
  est_split = 0.5,
  use_fold_means = TRUE,
  seed = NULL
)
# S3 method for pred_power_test
print(x, ...)
# S3 method for pred_power_test
summary(object, ...)
# S3 method for pred_power_test
tidy(x, ...)
# S3 method for pred_power_test
glance(x, ...)

Value

An object of class "pred_power_test" with fields:

statistic: Z statistic
p.value: Two-sided p-value
mean_diff: Mean of \(D\) (mean of the per-fold means if use_fold_means = TRUE)
n_val: Number of test observations used to compute Z (\(K\) if use_fold_means = TRUE, else the total validation n)
K: Number of folds (if kfold = TRUE)
method: Description string
call: Matched call

Arguments

model: A fitted svyglm with family = gaussian(identity).
kfold: Logical; if TRUE, use K-fold cross-validation (default TRUE).
K: Integer number of folds (default 5).
est_split: Proportion of observations assigned to the estimation set when kfold = FALSE (default 0.5).
use_fold_means: Logical; if TRUE (default), compute one \(D\) per fold as the mean of the within-fold \(D_i\), then form \(Z\) from the \(K\) fold means. This stabilizes the test by reducing the dependence among validation errors noted in Wang et al. (2023).
seed: Optional integer seed for reproducibility.
x: An object of class pred_power_test
...: Additional arguments passed to methods
object: An object of class pred_power_test
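
The effect of kfold = TRUE with use_fold_means = TRUE can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's code: each fold is taken to be the validation set once while the remaining K - 1 folds form the estimation set (the usual cross-validation convention, not confirmed by this page), lm() replaces svyglm(), the data are simulated, and the p-value uses a standard normal reference. All names are hypothetical.

```r
# Sketch only: assumed fold scheme, lm() in place of svyglm(), simulated data.
set.seed(2)
n <- 500
K <- 5
dat <- data.frame(x = rnorm(n), wt = runif(n, 1, 5))  # wt: made-up sampling weights
dat$y <- 1 + 2 * dat$x + rnorm(n)

# Randomly assign each observation to one of K equally sized folds.
fold <- sample(rep(seq_len(K), length.out = n))

# For each fold: fit on the other folds, validate on this one, keep mean(D).
fold_mean_D <- vapply(seq_len(K), function(k) {
  est <- dat[fold != k, ]
  val <- dat[fold == k, ]
  fit_u <- lm(y ~ x, data = est)
  fit_w <- lm(y ~ x, data = est, weights = wt)
  D <- (val$y - predict(fit_u, newdata = val))^2 -
       (val$y - predict(fit_w, newdata = val))^2
  mean(D)
}, numeric(1))

# The K fold means are the test observations, so n_V equals K here.
Z <- mean(fold_mean_D) / (sd(fold_mean_D) / sqrt(K))
p_value <- 2 * pnorm(-abs(Z))
```

In this variant the statistic rests on only K values, which matches the n_val field reporting \(K\) when use_fold_means = TRUE.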