nonlinear_attgt: Nonlinear Staggered DiD: Group-Time ATT Estimation

Description

Computes group-time average treatment effects on the treated (ATT(g,t)) for staggered difference-in-differences designs with nonlinear outcomes.

This function extends Callaway & Sant'Anna (2021) to handle binary, count, and other nonlinear outcomes, where parallel trends in the outcome mean can fail (for example, because probabilities are bounded between 0 and 1). The key methodological contributions are:

Parallel trends on the latent index (for logit/probit): Instead of assuming parallel trends in \(E[Y]\), we assume parallel trends in the latent index \(F^{-1}(E[Y])\), where \(F\) is the logistic or standard normal CDF.
Doubly-robust nonlinear estimator: Combines outcome regression (nonlinear model) with propensity score weighting, inheriting DR properties in the nonlinear setting.
Odds-ratio DiD: A scale-free estimand appropriate for binary outcomes that does not require parallel trends in probabilities.
Nonparametric bounds: When no functional form is assumed, provides sharp bounds on ATT(g,t).
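
To make the first point concrete, here is a minimal sketch of a two-group, two-period case using base R only. The four probabilities are made-up illustration values, not output of nonlinear_attgt(). It contrasts the counterfactual implied by parallel trends on the logit index with the one implied by parallel trends in probabilities.

```r
# Hypothetical mean outcomes (probabilities) in a 2x2 design
p_treat_pre  <- 0.30
p_ctrl_pre   <- 0.20
p_ctrl_post  <- 0.40
p_treat_post <- 0.60  # observed for the treated group after treatment

# Parallel trends on the logit index: the treated group's index would have
# moved by the same amount as the control group's index
index_cf   <- qlogis(p_treat_pre) + (qlogis(p_ctrl_post) - qlogis(p_ctrl_pre))
p_cf_index <- plogis(index_cf)

# Parallel trends in probabilities (the linear assumption)
p_cf_linear <- p_treat_pre + (p_ctrl_post - p_ctrl_pre)

# ATT on the probability scale under each assumption
att_index  <- p_treat_post - p_cf_index
att_linear <- p_treat_post - p_cf_linear
c(index = att_index, linear = att_linear)
```

The two assumptions give different counterfactuals from the same four numbers, which is why the choice of scale matters for bounded outcomes.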

Usage

nonlinear_attgt(
  data,
  yname,
  tname,
  idname,
  gname,
  xformla = ~1,
  outcome_model = c("logit", "probit", "poisson", "negbin", "linear"),
  estimand = c("att", "odds_ratio", "ape"),
  control_group = c("nevertreated", "notyetreated"),
  doubly_robust = TRUE,
  boot = FALSE,
  nboot = 999,
  boot_type = c("multiplier", "empirical"),
  alpha = 0.05,
  parallel = FALSE,
  pl_cores = 2L,
  anticipation = 0L
)

Value

An object of class nonlinear_attgt containing:

attgt: Data frame of ATT(g,t) estimates, standard errors, and confidence intervals for each (group, time) pair.
call: The matched call.
args: List of arguments used.
boot_draws: Matrix of bootstrap draws (if boot = TRUE).

Arguments

data

A data frame in long format (one row per unit-period).
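
If the data are stored in wide format (one row per unit), they need to be reshaped first. A minimal sketch with base R's reshape() follows; the column names id, g, period, and y_1 to y_3 are hypothetical.

```r
# Hypothetical wide data: one row per unit, one outcome column per period
wide <- data.frame(
  id  = 1:3,
  g   = c(2, 3, 0),
  y_1 = c(0, 0, 1),
  y_2 = c(1, 0, 1),
  y_3 = c(1, 1, 0)
)

# Stack the outcome columns into one row per unit-period
long <- reshape(
  wide,
  direction = "long",
  varying   = c("y_1", "y_2", "y_3"),
  v.names   = "y",
  timevar   = "period",
  times     = 1:3,
  idvar     = "id"
)
long <- long[order(long$id, long$period), ]
rownames(long) <- NULL
long
```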

yname

Character. Name of the outcome variable column.

tname

Character. Name of the time period column.

idname

Character. Name of the unit identifier column.

gname

Character. Name of the treatment cohort column (the period when a unit first receives treatment; 0 or Inf for never-treated units).
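
When the data carry only a 0/1 treatment indicator, the cohort column can be derived from it. The sketch below assumes a hypothetical indicator column named treated and codes never-treated units as 0.

```r
# Hypothetical long panel with a 0/1 treatment indicator
panel <- data.frame(
  id      = rep(1:3, each = 4),
  period  = rep(1:4, times = 3),
  treated = c(0, 0, 1, 1,   # unit 1: first treated in period 3
              0, 0, 0, 0,   # unit 2: never treated
              0, 1, 1, 1)   # unit 3: first treated in period 2
)

# First treated period per unit; 0 if the unit is never treated
first_treat <- sapply(split(panel, panel$id), function(d) {
  if (any(d$treated == 1)) min(d$period[d$treated == 1]) else 0
})

# Attach the cohort to every row of the unit
panel$g <- unname(first_treat[as.character(panel$id)])
unique(panel[, c("id", "g")])
```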

xformla

A one-sided formula for covariates (e.g., ~ x1 + x2). Default is ~ 1 (intercept only).

outcome_model

Character. The outcome model to use. One of:

"logit": Logistic regression (for binary Y)
"probit": Probit regression (for binary Y)
"poisson": Poisson regression (for count Y)
"negbin": Negative binomial (for overdispersed count Y)
"linear": Linear model (reproduces CS2021 when combined with doubly_robust = TRUE)

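For intuition, the options above correspond to familiar generalized linear models. The sketch below is an assumption about that correspondence, not a description of the package internals; family_for() is a hypothetical helper. It covers only the families available in the stats package; for "negbin", MASS::glm.nb() is one commonly used fitting function.

```r
# Hypothetical helper: maps an outcome_model string to a stats family object
family_for <- function(outcome_model) {
  switch(outcome_model,
    logit   = binomial(link = "logit"),
    probit  = binomial(link = "probit"),
    poisson = poisson(link = "log"),
    linear  = gaussian(link = "identity"),
    stop("no stats family for outcome_model = '", outcome_model, "'")
  )
}

# Toy data: one binary and one count outcome
set.seed(1)
x     <- rnorm(200)
y_bin <- rbinom(200, 1, plogis(-0.5 + x))
y_cnt <- rpois(200, exp(0.2 + 0.3 * x))

fit_logit <- glm(y_bin ~ x, family = family_for("logit"))
fit_pois  <- glm(y_cnt ~ x, family = family_for("poisson"))
coef(fit_logit)
coef(fit_pois)
```
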
estimand

Character. The treatment effect estimand:

"att": Average treatment effect on the treated (default)
"odds_ratio": Odds ratio DiD (binary outcomes only)
"ape": Average partial effect on the probability scale

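To illustrate how the odds-ratio estimand differs from a difference in probabilities, the sketch below computes a ratio of odds ratios for a two-group, two-period case. The probabilities are made-up values, and the exact definition used by the package may differ; this is one common form of the odds-ratio DiD.

```r
odds <- function(p) p / (1 - p)

# Hypothetical outcome probabilities
p_treat_pre <- 0.30; p_treat_post <- 0.60
p_ctrl_pre  <- 0.20; p_ctrl_post  <- 0.40

# Change in odds within each group, then the ratio between groups
or_treat <- odds(p_treat_post) / odds(p_treat_pre)
or_ctrl  <- odds(p_ctrl_post)  / odds(p_ctrl_pre)
or_did   <- or_treat / or_ctrl

# Equivalent view: a difference-in-differences on the log-odds scale
log_or_did <- (qlogis(p_treat_post) - qlogis(p_treat_pre)) -
  (qlogis(p_ctrl_post) - qlogis(p_ctrl_pre))

c(or_did = or_did, log_or_did = log_or_did)
```

A value of or_did above 1 means the odds of the outcome rose proportionally more in the treated group than in the control group.
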
control_group

Character. Which units serve as the control group:

"nevertreated": Use never-treated units only (default)
"notyetreated": Use not-yet-treated units

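The sketch below shows which units would enter the comparison group for a given (g, t) pair under each option. It follows the convention in Callaway & Sant'Anna (2021), where a unit counts as not yet treated at time t if its cohort is later than t plus any anticipation periods. control_ids() is a hypothetical helper, and never-treated units are assumed to be coded as 0 or Inf.

```r
# Hypothetical cohort table: one row per unit
cohorts <- data.frame(id = 1:6, g = c(0, 0, 2, 3, 4, 5))

control_ids <- function(cohorts, g, t,
                        control_group = c("nevertreated", "notyetreated"),
                        anticipation = 0L) {
  control_group <- match.arg(control_group)
  never <- cohorts$g == 0 | is.infinite(cohorts$g)
  if (control_group == "nevertreated") {
    return(cohorts$id[never])
  }
  # Units first treated strictly after t + anticipation, excluding cohort g
  not_yet <- cohorts$g > t + anticipation & cohorts$g != g
  cohorts$id[never | not_yet]
}

control_ids(cohorts, g = 3, t = 3, control_group = "nevertreated")
control_ids(cohorts, g = 3, t = 3, control_group = "notyetreated")
control_ids(cohorts, g = 3, t = 3, control_group = "notyetreated",
            anticipation = 1L)
```
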
doubly_robust

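Logical. If TRUE (default), uses the doubly-robust estimator that combines propensity score weighting with outcome regression. A doubly-robust estimator remains consistent if either the propensity score model or the outcome model is correctly specified, so it is less sensitive to misspecification of any one model.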
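
As a rough illustration of how weighting and outcome regression combine, the sketch below implements a standard doubly-robust DiD for the linear, two-period case on simulated data. It is not the nonlinear estimator implemented by this function, and all variable names are hypothetical.

```r
set.seed(123)
n  <- 4000
df <- data.frame(x = rnorm(n))
df$d  <- rbinom(n, 1, plogis(0.5 * df$x))              # treatment depends on x
df$dy <- 1 + 0.5 * df$x + 2 * df$d + rnorm(n)          # Y_post - Y_pre; true ATT = 2

# Propensity score: P(D = 1 | X)
ps <- fitted(glm(d ~ x, family = binomial(), data = df))

# Outcome regression: expected change in Y given X, fitted on controls only
or_fit <- lm(dy ~ x, data = df[df$d == 0, ])
m0     <- predict(or_fit, newdata = df)

# Normalized weights for treated units and for reweighted controls
w_treat    <- df$d / mean(df$d)
w_ctrl_raw <- ps * (1 - df$d) / (1 - ps)
w_ctrl     <- w_ctrl_raw / mean(w_ctrl_raw)

# Doubly-robust ATT: weighted contrast of regression-adjusted changes
att_dr <- mean((w_treat - w_ctrl) * (df$dy - m0))

# Unadjusted DiD for comparison; biased here because trends depend on x
att_naive <- mean(df$dy[df$d == 1]) - mean(df$dy[df$d == 0])
c(dr = att_dr, naive = att_naive)
```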

boot

Logical. If TRUE, uses bootstrap for inference. Default FALSE.

nboot

Integer. Number of bootstrap iterations. Default 999.

boot_type

Character. Type of bootstrap: "multiplier" (default, fast) or "empirical".
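
The multiplier bootstrap is fast because it perturbs estimated influence-function values instead of re-estimating the model on each draw. The sketch below shows the idea for a sample mean with Rademacher multipliers. Which multiplier distribution the package uses is not stated here, so treat that choice as an assumption.

```r
set.seed(2024)
n     <- 500
nboot <- 999
x     <- rexp(n)            # toy data; the estimator is the sample mean

theta_hat <- mean(x)
psi       <- x - theta_hat  # estimated influence function of the mean

# Rademacher multipliers: +1 or -1 with equal probability,
# one per unit and bootstrap draw
v <- matrix(sample(c(-1, 1), nboot * n, replace = TRUE),
            nrow = nboot, ncol = n)

# Each draw reweights the influence function; nothing is re-estimated
boot_draws <- as.vector(v %*% psi) / n

se_boot     <- sd(boot_draws)
se_analytic <- sd(x) / sqrt(n)
c(boot = se_boot, analytic = se_analytic)
```

An empirical bootstrap would instead resample units with replacement and recompute the estimator on every draw, which is slower for models that need iterative fitting.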

alpha

Numeric. Significance level for confidence intervals. Default 0.05, which gives 95% confidence intervals.
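
For reference, a pointwise normal-approximation interval at level alpha can be computed as below. The estimate and standard error are made-up values, and whether the package reports pointwise or simultaneous intervals is not stated here.

```r
alpha <- 0.05
est   <- 0.12  # hypothetical ATT(g,t) estimate
se    <- 0.04  # hypothetical standard error

# Two-sided critical value from the standard normal distribution
crit <- qnorm(1 - alpha / 2)
ci   <- c(lower = est - crit * se, upper = est + crit * se)
ci
```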

parallel

Logical. Use parallel processing for bootstrap. Default FALSE.

pl_cores

Integer. Number of cores for parallel processing. Default 2.

anticipation

Integer. Number of periods of anticipation allowed. Default 0.
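
In Callaway & Sant'Anna (2021), allowing for anticipation shifts the reference period used for each cohort back by the number of anticipation periods. The sketch below shows that arithmetic; base_period() is a hypothetical helper, and the package's internal convention may differ.

```r
# Last period assumed to be unaffected by treatment for cohort g
base_period <- function(g, anticipation = 0L) {
  g - anticipation - 1L
}

base_period(5L)                     # no anticipation
base_period(5L, anticipation = 1L)  # one period of anticipation
```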

References

Callaway, B., & Sant'Anna, P. H. C. (2021). Difference-in-differences with multiple time periods. Journal of Econometrics, 225(2), 200-230.

Wooldridge, J. M. (2023). Simple approaches to nonlinear difference-in-differences with panel data. The Econometrics Journal, 26(3).

Roth, J., & Sant'Anna, P. H. C. (2023). When is parallel trends sensitive to functional form? Econometrica, 91(2), 737-747.

Examples


# Simulate binary panel data
set.seed(42)
dat <- sim_binary_panel(n = 500, nperiods = 6, prop_treated = 0.4)

# Estimate ATT(g,t) with logistic outcome model
result <- nonlinear_attgt(
  data = dat,
  yname = "y",
  tname = "period",
  idname = "id",
  gname = "g",
  outcome_model = "logit",
  control_group = "nevertreated"
)

summary(result)
plot(result)
