run_eba: Extreme-Bounds Analysis (EBA) over Control-Variable Combinations

Description

Runs an Extreme-Bounds Analysis (EBA) over a predefined set of control variable combinations, fitting (or re-fitting) the bivariate hurdle model for each combination and extracting posterior mean coefficients for all regression blocks (mu_I, pi_I, mu_C, pi_C).

Usage

run_eba(DT, spec = "C", k_bma_table = NULL, seed = 123)

Value

A data.frame with the columns:

name: name of the covariate (design-matrix column).
mean: posterior mean of the corresponding coefficient.
block: block identifier ("mu_I", "pi_I", "mu_C", "pi_C").
combo: control-combination tag used for that fit.

Arguments

DT: A data.table or data.frame with the data passed to fit_one().
spec: Character scalar; model specification (e.g.\ "A", "B", "C", "D") passed to fit_one().
k_bma_table: Optional object (typically a named list or list-like structure) indexed by control-combination tags that indicates for which combinations a BMA selection table already exists. If k_bma_table[[tag]] is NULL or bma_weights_* CSV is missing, the function falls back to a default fit with k = 2 and default horseshoe hyperparameters.
seed: Integer; base random seed for the fits. For different control combinations, the seed is jittered to avoid identical pseudo-random sequences.

Details

The function assumes the existence of:

control_combos: a named object whose names are control tags (e.g.\ "None", "X1+X2", "X1+X3+X4"), defining which control sets to explore.
dir_csv: a character scalar with the directory where CSV files will be read/written.
fit_one(): a function that fits a single bivariate hurdle model and returns at least $fit (CmdStanR fit) and $des (design matrices).

For each control-combination tag tag:

If a BMA weights file "bma_weights_spec<spec>_ctrl<tag>.csv" exists in dir_csv and k_bma_table[[tag]] is not NULL, the top-weighted row (highest weight) is used to select k and horseshoe hyperparameters (hs_tau0, hs_slab_scale, hs_slab_df) for the fit.
Otherwise, the model is fit with k = 2 and default horseshoe hyperparameters.
Posterior means of the regression coefficients with prefixes "b_mu_I", "b_pi_I", "b_mu_C", "b_pi_C" are extracted and mapped back to the corresponding column names of the design matrices.

All coefficient summaries are stacked into a single table and written to "eba_coefficients.csv" in dir_csv.

Examples

Run this code

# \donttest{
library(data.table)

# 1. Create a COMPLETE dummy dataset
# This satisfies ALL requirements of build_design() and fit_one()
DT <- data.table(
  year = 2000:2020,
  # Dependent variables (Raw)
  I = rpois(21, lambda = 4),
  C = rpois(21, lambda = 3),
  # Dependent variables (Standardized/Transformed - required by build_design)
  zI = rnorm(21),
  zC = rnorm(21),
  # Trend variables (required if include_trend=TRUE)
  t_norm = seq(-1, 1, length.out = 21),
  t_poly2 = seq(-1, 1, length.out = 21)^2,
  # Regime (required if include_regimes=TRUE)
  Regime = factor(sample(c("A", "B"), 21, replace = TRUE)),
  # Transition dummies (required if include_transitions=TRUE)
  # Specifically: trans_PS, trans_SF, trans_FC
  trans_PS = sample(0:1, 21, replace = TRUE),
  trans_SF = sample(0:1, 21, replace = TRUE),
  trans_FC = sample(0:1, 21, replace = TRUE),
  # Exposure offset (required by fit_one)
  log_exposure50 = rep(0, 21),
  # Control variables (used in this specific example)
  X1 = rnorm(21),
  X2 = rnorm(21),
  X3 = rnorm(21)
)

# 2. Define global objects required by run_eba
control_combos <- list(
  None      = character(0),
  "X1+X2"   = c("X1", "X2"),
  "X1+X2+X3"= c("X1", "X2", "X3")
)

# 3. Define global paths using tempdir()
tmp_dir <- tempdir()
dir_csv <- file.path(tmp_dir, "csv")
if (!dir.exists(dir_csv)) dir.create(dir_csv, recursive = TRUE)

# 4. Run the function
# Note: This will attempt to run Stan. If CmdStan is not configured,
# it might fail later, but the DATA error is fixed.
try({
  eba_tab <- run_eba(DT, spec = "C", k_bma_table = list(), seed = 123)
  print(head(eba_tab))
})
# }

Run the code above in your browser using DataLab