sim_binary_panel: Simulate Binary Panel Data with Staggered Treatment
Description
Generates a simulated panel dataset with staggered treatment adoption and
a binary outcome. Useful for testing and illustrating nonlinear DiD methods.
The data-generating process is:
$$Y_{it} = \mathbf{1}\{ \alpha_i + \lambda_t + \delta_{it} \cdot D_{it} + \epsilon_{it} > 0 \}$$
where \(\alpha_i\) is a unit fixed effect, \(\lambda_t\) is a time
fixed effect, \(\delta_{it}\) is the treatment effect (heterogeneous
across cohorts), and \(\epsilon_{it}\) is logistic noise.
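The latent-index equation above can be sketched in a few lines of base R. This is only an illustration, not the package's implementation: the cohort timing (first treated period 3, 4, or 5), the time effects `lambda`, and the cohort effects `delta` are assumed values chosen for the example, as is the convention that `g` records the first treated period.

```r
# Illustrative sketch of the latent-index DGP; not the package's actual code.
set.seed(1)
n <- 200        # units
nperiods <- 6   # time periods

# Long-format skeleton: one row per unit-period.
panel <- expand.grid(id = seq_len(n), period = seq_len(nperiods))

# Assumed cohort assignment: g is the first treated period, 0 = never treated.
g_unit <- sample(c(0, 3, 4, 5), size = n, replace = TRUE,
                 prob = c(0.5, 1/6, 1/6, 1/6))
panel$g <- g_unit[panel$id]
panel$D <- as.integer(panel$g > 0 & panel$period >= panel$g)

# Assumed components of the latent index.
alpha  <- rnorm(n, mean = 0, sd = 0.5)            # alpha_i, unit fixed effects
lambda <- seq(-0.2, 0.2, length.out = nperiods)   # lambda_t, time fixed effects
delta  <- c("3" = 0.6, "4" = 0.4, "5" = 0.2)      # delta, one value per cohort

# Look up each treated row's cohort-specific effect; never-treated rows stay 0.
delta_it <- numeric(nrow(panel))
treated  <- panel$g > 0
delta_it[treated] <- delta[as.character(panel$g[treated])]

# Y_it = 1{ alpha_i + lambda_t + delta_it * D_it + eps_it > 0 }, eps ~ logistic.
index   <- alpha[panel$id] + lambda[panel$period] + delta_it * panel$D
panel$y <- as.integer(index + rlogis(nrow(panel)) > 0)

# Implied P(Y = 1) given the index, because the logistic noise is symmetric.
panel$p <- plogis(index)
```

Because the noise is standard logistic, the outcome probability conditional on the index is `plogis(index)`, so a logit model with unit and time effects is correctly specified for data built this way.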
Value
A data frame in long format with the following columns: id (unit identifier),
period (time period, 1 to nperiods), y (binary outcome, 0/1),
g (treatment cohort; 0 = never treated), D (treatment
indicator), x1 and x2 (covariates, included only if
add_covariates = TRUE), and alpha_i (true unit fixed effect,
for validation).
Arguments
n
Integer. Number of units. Default 500.
nperiods
Integer. Number of time periods. Default 6.
prop_treated
Numeric. Proportion of units ever treated. Default 0.5.
n_cohorts
Integer. Number of treatment cohorts (groups). Default 3.
true_att
Numeric or vector. True ATT for each cohort. Default 0.3.
base_prob
Numeric. Baseline probability P(Y=1) for untreated.
Default 0.3.
unit_fe_sd
Numeric. Std. dev. of unit fixed effects. Default 0.5.
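To see how base_prob and unit_fe_sd interact under logistic noise, the sketch below maps a baseline probability to a latent-index intercept with `qlogis()` and then averages over simulated unit effects. That sim_binary_panel uses exactly this mapping is an assumption; the variable names `intercept`, `p_typical`, and `p_average` are introduced here for illustration only.

```r
# Sketch: relation between a baseline probability and a latent-index intercept.
# Assumes the intercept is set to qlogis(base_prob); the package may differ.
base_prob  <- 0.3
unit_fe_sd <- 0.5

intercept <- qlogis(base_prob)   # log(0.3 / 0.7)
p_typical <- plogis(intercept)   # P(Y = 1) for a unit with alpha_i = 0

# Average P(Y = 1) across units once normal unit effects are added.
set.seed(1)
alpha     <- rnorm(1e5, mean = 0, sd = unit_fe_sd)
p_average <- mean(plogis(intercept + alpha))
```

Under this assumption, `p_typical` recovers base_prob for a unit with a zero fixed effect, while `p_average` sits somewhat closer to 0.5 because the logistic function is nonlinear. The sample mean of y among untreated observations therefore need not match base_prob exactly when unit_fe_sd is large.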