simultaneous_ci: Compute Simultaneous Confidence Intervals via Bootstrap (Post-Selection Inference)

Description

Implements Algorithm 1 of Kuchibhotla, Kolassa, and Kuffner (2022). The function uses bootstrap max-t statistics to construct simultaneous confidence intervals for the regression coefficients of every model in a user-specified universe of linear models, so the intervals remain valid for whichever model in that universe is selected.

Usage

simultaneous_ci(
  X,
  y,
  Q_universe,
  alpha = 0.05,
  B = 1000,
  add_intercept = TRUE,
  bootstrap_method = "pairs",
  cores = 1,
  use_pbapply = TRUE,
  seed = NULL,
  verbose = TRUE,
  ...
)

Value

A list of class simultaneous_ci_result with elements:

intervals: Data frame with estimates, confidence intervals, variances, and standard errors
K_alpha: Bootstrap (1 - alpha) quantile of max-t statistics
T_star_b: Vector of bootstrap max-t statistics
n_valid_T_star_b: Number of finite bootstrap max-t statistics
alpha, B, bootstrap_method: The values of the corresponding arguments used for the run
warnings_list: Internal warnings collected during bootstrap/model fitting
valid_bootstrap_counts: Valid bootstrap replicates per parameter
n_bootstrap_errors: Total bootstrap fitting errors

Arguments

X: Numeric matrix (n x p): Design matrix. Must have unique column names. Do not include an intercept if add_intercept = TRUE.
y: Numeric vector (length n): Response vector.
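Q_universe: Named list of numeric vectors. Each element specifies one model as a vector of column indices into the design matrix. If add_intercept = TRUE, the indices refer to the design matrix after the intercept has been added as the first column, so index 1 is the intercept and the columns of X start at index 2. Names are used to identify each model in the results.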
alpha: Significance level for the confidence intervals. Default is 0.05.
B: Integer. Number of bootstrap samples. Default is 1000.
add_intercept: Logical. If TRUE, adds an intercept as the first column of the design matrix. Default is TRUE.
bootstrap_method: Character. Bootstrap type. Only "pairs" is currently supported.
cores: Integer. Number of CPU cores to use for bootstrap parallelization. Default is 1.
use_pbapply: Logical. Use pbapply for progress bars if available. Default is TRUE.
seed: Optional numeric. Random seed. When supplied, it is used to set up parallel-safe random number generation so that results are reproducible.
verbose: Logical. Whether to display status messages. Default is TRUE.
...: Reserved for future use.
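
As an illustration of the Q_universe format, the sketch below builds a universe containing every non-empty subset of three predictors, each with the intercept included. It assumes, as the descriptions of Q_universe and add_intercept suggest, that index 1 refers to the intercept column and that the columns of X follow from index 2. The helper name make_universe is hypothetical and is not part of the package.

```r
# Hypothetical helper (not part of the package): enumerate every non-empty
# subset of the predictors and prepend the intercept index.
make_universe <- function(predictor_names) {
  p <- length(predictor_names)
  universe <- list()
  for (k in seq_len(p)) {
    # All subsets of size k, as a list of integer vectors
    subsets <- combn(p, k, simplify = FALSE)
    for (s in subsets) {
      # Assumption: with add_intercept = TRUE, column 1 is the intercept,
      # so predictor j sits at index j + 1 of the design matrix.
      model_name <- paste(predictor_names[s], collapse = "+")
      universe[[model_name]] <- c(1, s + 1)
    }
  }
  universe
}

Q <- make_universe(c("X1", "X2", "X3"))
length(Q)      # 7 models: 2^3 - 1 non-empty subsets
Q[["X1+X3"]]   # indices 1, 2, 4 (intercept, X1, X3)
```

The universe grows as 2^p - 1, and every model is refitted in each bootstrap replicate, so restricting the universe to the models that could plausibly be selected keeps the run time manageable.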

Details

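The function supports parallel execution of the bootstrap, captures warnings raised internally during bootstrap resampling and model fitting, and returns a structured result containing the estimates, the simultaneous intervals, bootstrap diagnostics, and the inference statistics.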
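
To make the max-t construction concrete, here is a minimal base-R sketch of the general idea: resample (x, y) pairs, refit every model in the universe, record the largest absolute studentized deviation across all models and coefficients, and use the (1 - alpha) quantile of those maxima as a common critical value. This is an illustration under stated assumptions, not the package's implementation. In particular, the helper fit_model, the use of sandwich standard errors, and the default quantile type are assumptions; the package may estimate variances differently.

```r
set.seed(1)
n <- 100
X <- cbind(1, matrix(rnorm(n * 2), n, 2))   # intercept plus two predictors
y <- 0.5 * X[, 2] + rnorm(n)
Q_universe <- list(small = c(1, 2), full = c(1, 2, 3))
alpha <- 0.05
B <- 200

# Hypothetical helper: least-squares fit with sandwich standard errors
# (assumed here; the package's variance estimator is not documented above).
fit_model <- function(X, y, cols) {
  Xq <- X[, cols, drop = FALSE]
  XtX_inv <- solve(crossprod(Xq))
  beta <- drop(XtX_inv %*% crossprod(Xq, y))
  res <- drop(y - Xq %*% beta)
  meat <- crossprod(Xq * res)               # sum of e_i^2 * x_i x_i'
  se <- sqrt(diag(XtX_inv %*% meat %*% XtX_inv))
  list(beta = beta, se = se)
}

# Fits on the original data, one per model in the universe
fits <- lapply(Q_universe, fit_model, X = X, y = y)

# Pairs bootstrap: one max-t statistic per replicate, taken over
# every coefficient of every model in the universe
T_star <- replicate(B, {
  idx <- sample.int(n, n, replace = TRUE)
  max(vapply(names(Q_universe), function(m) {
    fb <- fit_model(X[idx, , drop = FALSE], y[idx], Q_universe[[m]])
    max(abs(fb$beta - fits[[m]]$beta) / fb$se)
  }, numeric(1)))
})

# Common critical value from the finite bootstrap statistics
K_alpha <- quantile(T_star[is.finite(T_star)], 1 - alpha, names = FALSE)

# Simultaneous intervals: estimate +/- K_alpha * SE for every coefficient
intervals <- do.call(rbind, lapply(names(fits), function(m) {
  f <- fits[[m]]
  data.frame(model = m, index = Q_universe[[m]], estimate = f$beta,
             lower = f$beta - K_alpha * f$se,
             upper = f$beta + K_alpha * f$se)
}))
intervals
```

Because a single critical value is shared by all coefficients in all models, it is typically larger than the usual per-coefficient normal quantile, and the resulting intervals are correspondingly wider.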

References

Kuchibhotla, A., Kolassa, J., & Kuffner, T. (2022). Post-selection inference. Annual Review of Statistics and Its Application, 9(1), 505–527.

Examples

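set.seed(123)
# Simulate 100 observations with two named predictors; only X1 affects y.
X <- matrix(rnorm(100 * 2), 100, 2, dimnames = list(NULL, c("X1", "X2")))
y <- X[, 1] * 0.5 + rnorm(100)
# A universe with a single model. With add_intercept = TRUE (the default),
# indices 1:2 select the intercept and X1.
Q <- list(model = 1:2)
# B = 100 keeps the example fast; the default is B = 1000.
res <- simultaneous_ci(X, y, Q, B = 100, cores = 1)
print(res$intervals)
plot(res)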
