Computes group-time average treatment effects on the treated (ATT(g,t)) for staggered difference-in-differences designs with nonlinear outcomes.
This function extends Callaway & Sant'Anna (2021) to binary, count, and other nonlinear outcomes, where parallel trends in the level of the outcome may be implausible (for example, because the outcome is bounded). The key methodological contributions are:
Parallel trends on the latent index (for logit/probit): Instead of assuming parallel trends in \(E[Y]\), we assume parallel trends in the latent index \(F^{-1}(E[Y])\), where \(F\) is the logistic or standard normal CDF.
Doubly-robust nonlinear estimator: Combines outcome regression (nonlinear model) with propensity score weighting, carrying the doubly-robust (DR) property over to the nonlinear setting.
Odds-ratio DiD: A scale-free estimand appropriate for binary outcomes that does not require parallel trends in probabilities.
Nonparametric bounds: When no functional form is assumed, provides sharp bounds on ATT(g,t).
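To make the latent-index and odds-ratio ideas concrete, the sketch below works through a single two-group, two-period comparison in base R. The four cell means are made-up illustrative values, and the ratio-of-odds-ratios formula is one common reading of an odds-ratio DiD. Neither is taken from the internals of nonlinear_attgt.

```r
# Hypothetical cell means of a binary outcome in a 2x2 design (illustrative only)
p_treat_pre  <- 0.30
p_treat_post <- 0.55
p_ctrl_pre   <- 0.20
p_ctrl_post  <- 0.30

# Parallel trends on the logit index: absent treatment, the treated group's
# index is assumed to shift by the same amount as the control group's index.
index_trend     <- qlogis(p_ctrl_post) - qlogis(p_ctrl_pre)
p_treat_post_cf <- plogis(qlogis(p_treat_pre) + index_trend)

# ATT on the probability scale under the index assumption
att_index <- p_treat_post - p_treat_post_cf

# For comparison: ATT under linear parallel trends in probabilities
att_linear <- (p_treat_post - p_treat_pre) - (p_ctrl_post - p_ctrl_pre)

# Ratio of odds ratios (assumed definition of the odds-ratio DiD)
odds   <- function(p) p / (1 - p)
or_did <- (odds(p_treat_post) / odds(p_treat_pre)) /
          (odds(p_ctrl_post)  / odds(p_ctrl_pre))

print(c(att_index = att_index, att_linear = att_linear, or_did = or_did))
```

With these values the index-based ATT is smaller than the linear one: the same shift on the logit scale translates into a larger change in probability for the treated group, whose baseline is closer to 0.5.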
nonlinear_attgt(
  data,
  yname,
  tname,
  idname,
  gname,
  xformla = ~1,
  outcome_model = c("logit", "probit", "poisson", "negbin", "linear"),
  estimand = c("att", "odds_ratio", "ape"),
  control_group = c("nevertreated", "notyetreated"),
  doubly_robust = TRUE,
  boot = FALSE,
  nboot = 999,
  boot_type = c("multiplier", "empirical"),
  alpha = 0.05,
  parallel = FALSE,
  pl_cores = 2L,
  anticipation = 0L
)
An object of class nonlinear_attgt containing:
Data frame of ATT(g,t) estimates, standard errors, and confidence intervals for each (group, time) pair.
The matched call.
List of arguments used.
Matrix of bootstrap draws (if boot = TRUE).
data: A data frame in long format (one row per unit-period).
yname: Character. Name of the outcome variable column.
tname: Character. Name of the time period column.
idname: Character. Name of the unit identifier column.
gname: Character. Name of the treatment cohort column (the period when a unit first receives treatment; 0 or Inf for never-treated units).
xformla: A one-sided formula for covariates (e.g., ~ x1 + x2). Default is ~ 1 (intercept only).
outcome_model: Character. The outcome model to use. One of:
  "logit": Logistic regression (for binary Y)
  "probit": Probit regression (for binary Y)
  "poisson": Poisson regression (for count Y)
  "negbin": Negative binomial (for overdispersed count Y)
  "linear": Linear model (reproduces CS2021 when combined with doubly_robust = TRUE)
estimand: Character. The treatment effect estimand:
  "att": Average treatment effect on the treated (default)
  "odds_ratio": Odds ratio DiD (binary outcomes only)
  "ape": Average partial effect on the probability scale
control_group: Character. Which units serve as the control group:
  "nevertreated": Use never-treated units only (default)
  "notyetreated": Use not-yet-treated units
doubly_robust: Logical. If TRUE (default), uses the doubly-robust estimator that combines propensity score weighting with outcome regression, making the estimate robust to misspecification of either the propensity score model or the outcome model (but not both).
boot: Logical. If TRUE, uses bootstrap for inference. Default FALSE.
nboot: Integer. Number of bootstrap iterations. Default 999.
boot_type: Character. Type of bootstrap: "multiplier" (default, fast) or "empirical".
alpha: Numeric. Significance level for confidence intervals. Default 0.05.
parallel: Logical. If TRUE, runs the bootstrap in parallel. Default FALSE.
pl_cores: Integer. Number of cores for parallel processing. Default 2.
anticipation: Integer. Number of periods of anticipation allowed. Default 0.
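The "multiplier" bootstrap option is described as fast because a multiplier bootstrap perturbs an estimator's influence function rather than refitting the model on resampled data. The base R sketch below illustrates the idea for the simplest possible estimator, a sample mean. The simulated data, the Rademacher weights, and the critical-value construction are assumptions made for illustration; the page does not state which weights or interval construction nonlinear_attgt uses.

```r
set.seed(1)

# Hypothetical sample; the "estimator" here is just the sample mean
x <- rnorm(200, mean = 1, sd = 2)
n <- length(x)
theta_hat <- mean(x)

# Estimated influence function of the sample mean
psi <- x - theta_hat

# Multiplier bootstrap: reweight the influence function with i.i.d.
# Rademacher draws (+1 or -1 with equal probability); no refitting needed
nboot <- 999
draws <- replicate(nboot, {
  v <- sample(c(-1, 1), n, replace = TRUE)
  mean(v * psi)
})

# Bootstrap standard error and a symmetric (1 - alpha) confidence interval
alpha   <- 0.05
se_boot <- sd(draws)
crit    <- quantile(abs(draws) / se_boot, probs = 1 - alpha, names = FALSE)
ci      <- theta_hat + c(-1, 1) * crit * se_boot

print(c(estimate = theta_hat, se = se_boot, lower = ci[1], upper = ci[2]))
```

Each draw costs one weighted average, so the loop stays cheap even when the underlying model is expensive to fit. An empirical bootstrap would instead resample units and re-estimate on each replicate.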
Callaway, B., & Sant'Anna, P. H. C. (2021). Difference-in-differences with multiple time periods. Journal of Econometrics, 225(2), 200-230.
Wooldridge, J. M. (2023). Simple approaches to nonlinear difference-in-differences with panel data. The Econometrics Journal, 26(3).
Roth, J., & Sant'Anna, P. H. C. (2023). When is parallel trends sensitive to functional form? Econometrica, 91(2), 737-747.
# Simulate binary panel data
set.seed(42)
dat <- sim_binary_panel(n = 500, nperiods = 6, prop_treated = 0.4)
# Estimate ATT(g,t) with logistic outcome model
result <- nonlinear_attgt(
  data = dat,
  yname = "y",
  tname = "period",
  idname = "id",
  gname = "g",
  outcome_model = "logit",
  control_group = "nevertreated"
)
summary(result)
plot(result)
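The example relies on sim_binary_panel(), whose output is not shown here. As a rough guide to the long layout the arguments expect (one row per unit-period, with a cohort column coded 0 for never-treated units), the following base R sketch builds a tiny panel by hand. It is a hypothetical stand-in, not the package's simulator; the column name treated, the cohort assignments, and the outcome probabilities are all illustrative assumptions.

```r
set.seed(42)

n_units <- 6
periods <- 1:4

# Hypothetical first-treatment period per unit; 0 marks never-treated units
cohort <- c(0, 0, 3, 3, 4, 0)

# One row per unit-period, sorted by unit and then period
panel <- expand.grid(id = seq_len(n_units), period = periods)
panel <- panel[order(panel$id, panel$period), ]
rownames(panel) <- NULL

# Cohort column, constant within unit (this is what gname would point to)
panel$g <- cohort[panel$id]

# Treatment status implied by the cohort coding
panel$treated <- as.integer(panel$g > 0 & panel$period >= panel$g)

# Binary outcome with a higher success probability once treated
panel$y <- rbinom(nrow(panel), size = 1, prob = 0.3 + 0.3 * panel$treated)

head(panel)
```

A data frame shaped like this could be passed with yname = "y", tname = "period", idname = "id", and gname = "g", matching the names used in the example above.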