run_synth_bsts: Synthetic Control via BSTS (CausalImpact)

Description

Runs a simple synthetic-control-style analysis with CausalImpact/BSTS, using either I or C as the outcome. The treated period is not supplied by the user: it is derived from the data as the stretch of observations where a chosen control variable takes high values.

Usage

run_synth_bsts(DT, outcome = c("I", "C"), control_var, seed = 123)

Value

On success, a list with components:

impact: the full CausalImpact object.
summary: a data.frame with the mean absolute and relative effects.

If the treated period is too short or the model fit fails, the function returns NULL.

Arguments

DT

A data.frame or data.table containing at least:

I, C: outcome candidates (counts or rates).
EconCycle, PopDensity, Epidemics, Climate, War, t_norm: predictors used to build the synthetic control.
The column named in control_var, used to define the treated period.

outcome

Character; which outcome series to use as the response, one of "I" or "C".

control_var

Character scalar; name of a column in DT whose high values define the treated period (e.g., intensity of some intervention or shock proxy).

seed

Integer; random seed for reproducibility of the BSTS fit.

Details

The function:

Selects the outcome series y <- DT[[outcome]].
Builds the predictor matrix from EconCycle, PopDensity, Epidemics, Climate, War, and t_norm.
Uses control_var to define a treated period as observations where control_var is in the top third (>= 2/3 quantile). If fewer than 5 treated observations are found, the function returns NULL.
Sets the intervention start time t0 as one period before the first treated index (with a minimum of 10 observations in the pre-period). The pre- and post-intervention windows are: pre.period = c(1, t0) and post.period = c(t0 + 1, length(y)).
Calls CausalImpact::CausalImpact() on the combined cbind(y, preds) matrix, with model.args = list(nseasons = 1).
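
The window logic in the steps above can be sketched in base R. This is a minimal illustration, not the package's source: the helper name find_windows and its arguments are hypothetical, and it assumes that the "minimum of 10 observations" rule is applied as max(10, first treated index - 1).

```r
# Hypothetical helper (not part of the package): derives the pre/post windows
# from a control variable, following the rules described in Details.
find_windows <- function(x, min_treated = 5L, min_pre = 10L) {
  # Treated observations: values at or above the 2/3 quantile (top third)
  thr <- stats::quantile(x, probs = 2 / 3, na.rm = TRUE)
  treated <- which(x >= thr)

  # Too few treated observations: mirror the documented NULL return
  if (length(treated) < min_treated) return(NULL)

  # Assumption: t0 is one period before the first treated index,
  # but never smaller than min_pre
  t0 <- max(min_pre, treated[1] - 1L)
  n <- length(x)

  # Assumption: no room left for a post-period is also treated as a failure
  if (t0 >= n) return(NULL)

  list(pre.period = c(1, t0), post.period = c(t0 + 1, n))
}

# Toy control variable: low for 20 periods, high for the last 10
x <- c(rep(0, 20), rep(1, 10))
w <- find_windows(x)
print(w)
```

The two vectors have the shape that CausalImpact::CausalImpact() accepts for its pre.period and post.period arguments when the data are indexed by row number.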

From the resulting impact object, the function extracts the average absolute and relative effects from impact$summary and stores them in a small summary table with two rows: "abs_effect_mean" and "rel_effect_mean".
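
A sketch of how such a two-row table could be assembled is shown below. It relies on the layout of impact$summary in CausalImpact (rows "Average" and "Cumulative", columns including AbsEffect and RelEffect). The helper name summarise_effects and the output column names metric and value are assumptions, and the numbers in the stand-in data frame are made up purely so the example runs without fitting a model.

```r
# Hypothetical helper: `s` is assumed to have the layout of impact$summary
summarise_effects <- function(s) {
  data.frame(
    metric = c("abs_effect_mean", "rel_effect_mean"),
    value  = c(s["Average", "AbsEffect"], s["Average", "RelEffect"]),
    stringsAsFactors = FALSE
  )
}

# Stand-in for impact$summary with made-up values (illustration only)
s <- data.frame(
  AbsEffect = c(1.5, 15),
  RelEffect = c(0.10, 0.10),
  row.names = c("Average", "Cumulative")
)

out <- summarise_effects(s)
print(out)
```

With a real fit, the same helper would be called as summarise_effects(impact$summary).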

The function also writes a CSV file named "causalimpact_<control_var>_on_<outcome>.csv" to the directory given by dir_csv, a character scalar that must be defined in the global environment before the call. If CausalImpact() fails, the function returns NULL.
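
The file naming can be reproduced as follows. This is a sketch under assumptions: it supposes that the summary table is what gets written and that write.csv is used, neither of which is confirmed by this page. The placeholder table contains no real results.

```r
# Output directory under tempdir(), playing the role of the global dir_csv
dir_csv <- file.path(tempdir(), "csv")
dir.create(dir_csv, recursive = TRUE, showWarnings = FALSE)

control_var <- "War"
outcome <- "I"

# File name pattern as documented: causalimpact_<control_var>_on_<outcome>.csv
out_file <- file.path(
  dir_csv,
  sprintf("causalimpact_%s_on_%s.csv", control_var, outcome)
)

# Placeholder table (assumed layout, NA values instead of real estimates)
summary_tbl <- data.frame(
  metric = c("abs_effect_mean", "rel_effect_mean"),
  value  = c(NA_real_, NA_real_)
)

utils::write.csv(summary_tbl, out_file, row.names = FALSE)
print(basename(out_file))
```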

Examples

# \donttest{
library(data.table)

# Seed the RNG so the dummy data below are reproducible
set.seed(42)

# 1. Create dummy data with ALL required predictors
# The function explicitly selects: EconCycle, PopDensity, Epidemics, Climate, War, t_norm
DT <- data.table(
  year = 2000:2029,
  I = rpois(30, lambda = 10),
  C = rpois(30, lambda = 8),
  # Predictors required by run_synth_bsts internal selection
  EconCycle = rnorm(30),
  PopDensity = rnorm(30),
  Epidemics = rnorm(30),
  Climate = rnorm(30),
  War = rnorm(30),
  t_norm = seq(-1, 1, length.out = 30)
)

# 2. Define the global output directory under tempdir(), so nothing is
# written outside the session's temporary folder.
# run_synth_bsts writes its CSV output to 'dir_csv'
tmp_dir <- tempdir()
dir_csv <- file.path(tmp_dir, "csv")
if (!dir.exists(dir_csv)) dir.create(dir_csv, recursive = TRUE)

# 3. Run the function
# We use "War" as the control variable to define the treatment period
res_I <- run_synth_bsts(DT, outcome = "I", control_var = "War", seed = 123)

# Inspect results if successful (might return NULL if fit fails or not enough data)
if (!is.null(res_I)) {
  print(res_I$summary)
}
# }
