Calculate two-stage difference-in-differences following Gardner (2021)
did2s(
data,
yname,
first_stage,
second_stage,
treatment,
cluster_var,
weights = NULL,
bootstrap = FALSE,
n_bootstraps = 250,
return_bootstrap = FALSE,
verbose = FALSE
)fixest object with adjusted standard errors
(either by formula or by bootstrap). All the methods from fixest package
will work, including fixest::esttable and
fixest::coefplot
The dataframe containing all the variables
Outcome variable
Fixed effects and other covariates you want to residualize
with in first stage.
Formula following fixest::feols.
Fixed effects specified after "|".
Second stage, these should be the treatment indicator(s)
(e.g. treatment variable or event-study leads/lags).
Formula following fixest::feols.
Use i() for factor variables, see fixest::i.
A variable that = 1 if treated, = 0 otherwise. The first
stage will be estimated for treatment == 0. The second stage will be
estimated for the full sample.
What variable to cluster standard errors. This can be IDs or a higher aggregate level (state for example)
Optional. Variable name for regression weights.
Optional. Should standard errors be calculated using bootstrap?
Default is FALSE.
Optional. How many bootstraps to run.
Default is 250.
Optional. Logical. Will return each bootstrap second-stage estimate to allow for manual use, e.g. percentile standard errors and empirical confidence intervals.
Optional. Logical. Should information about the two-stage
procedure be printed back to the user?
Default is TRUE.
Load example dataset which has two treatment groups and homogeneous treatment effects
# Load Example Dataset
data("df_hom")
You can run a static TWFE fixed effect model for a simple treatment indicator
static <- did2s(df_hom,
yname = "dep_var", treatment = "treat", cluster_var = "state",
first_stage = ~ 0 | unit + year,
second_stage = ~ i(treat, ref=FALSE))fixest::esttable(static)
#> static
#> Dependent Var.: dep_var
#>
#> treat = TRUE 2.005*** (0.0202)
#> _______________ _________________
#> S.E.: Clustered by: state
#> Observations 46,500
#> R2 0.47520
#> Adj. R2 0.47520
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Or you can use relative-treatment indicators to estimate an event study estimate
es <- did2s(df_hom,
yname = "dep_var", treatment = "treat", cluster_var = "state",
first_stage = ~ 0 | unit + year,
second_stage = ~ i(rel_year, ref=c(-1, Inf)))fixest::esttable(es)
#> es
#> Dependent Var.: dep_var
#>
#> rel_year = -20 0.0043 (0.0322)
#> rel_year = -19 0.0222 (0.0296)
#> rel_year = -18 -0.0358 (0.0308)
#> rel_year = -17 0.0043 (0.0337)
#> rel_year = -16 -0.0186 (0.0353)
#> rel_year = -15 -0.0045 (0.0346)
#> rel_year = -14 -0.0393 (0.0384)
#> rel_year = -13 0.0453 (0.0323)
#> rel_year = -12 0.0324 (0.0309)
#> rel_year = -11 -0.0245 (0.0349)
#> rel_year = -10 -0.0017 (0.0241)
#> rel_year = -9 0.0155 (0.0242)
#> rel_year = -8 -0.0073 (0.0210)
#> rel_year = -7 -0.0513* (0.0202)
#> rel_year = -6 0.0269 (0.0237)
#> rel_year = -5 0.0136 (0.0237)
#> rel_year = -4 0.0381. (0.0223)
#> rel_year = -3 -0.0228 (0.0284)
#> rel_year = -2 0.0041 (0.0228)
#> rel_year = 0 1.971*** (0.0470)
#> rel_year = 1 2.050*** (0.0466)
#> rel_year = 2 2.033*** (0.0441)
#> rel_year = 3 1.966*** (0.0400)
#> rel_year = 4 1.965*** (0.0430)
#> rel_year = 5 2.030*** (0.0456)
#> rel_year = 6 2.040*** (0.0447)
#> rel_year = 7 1.995*** (0.0370)
#> rel_year = 8 2.019*** (0.0485)
#> rel_year = 9 1.955*** (0.0468)
#> rel_year = 10 1.950*** (0.0455)
#> rel_year = 11 2.117*** (0.0664)
#> rel_year = 12 2.132*** (0.0741)
#> rel_year = 13 2.019*** (0.0640)
#> rel_year = 14 2.013*** (0.0522)
#> rel_year = 15 1.961*** (0.0605)
#> rel_year = 16 1.916*** (0.0584)
#> rel_year = 17 1.938*** (0.0607)
#> rel_year = 18 2.070*** (0.0666)
#> rel_year = 19 2.066*** (0.0609)
#> rel_year = 20 1.964*** (0.0612)
#> _______________ _________________
#> S.E.: Clustered by: state
#> Observations 46,500
#> R2 0.47577
#> Adj. R2 0.47533
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# plot rel_year coefficients and standard errors
fixest::coefplot(es, keep = "rel_year::(.*)")
Here's an example using data from Cheng and Hoekstra (2013)
# Castle Data
castle <- haven::read_dta("https://github.com/scunning1975/mixtape/raw/master/castle.dta")did2s(
data = castle,
yname = "l_homicide",
first_stage = ~ 0 | sid + year,
second_stage = ~ i(post, ref=0),
treatment = "post",
cluster_var = "state", weights = "popwt"
)
#> OLS estimation, Dep. Var.: l_homicide
#> Observations: 550
#> Weights: weights_vector
#> Standard-errors: Corrected Clustered (state)
#> Estimate Std. Error t value Pr(>|t|)
#> post::1 0.075142 0.03538 2.12387 0.034127 *
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 0.109374 Adj. R2: 0.052465