did2s: Calculate two-stage difference-in-differences following Gardner (2021)

Description

Calculate two-stage difference-in-differences following Gardner (2021)

Usage

did2s(
  data,
  yname,
  first_stage,
  second_stage,
  treatment,
  cluster_var,
  weights = NULL,
  bootstrap = FALSE,
  n_bootstraps = 250,
  return_bootstrap = FALSE,
  verbose = FALSE
)

Value

fixest object with adjusted standard errors (either by formula or by bootstrap). All the methods from fixest package will work, including fixest::esttable and fixest::coefplot

Arguments

data: The dataframe containing all the variables
yname: Outcome variable
first_stage: Fixed effects and other covariates you want to residualize with in first stage. Formula following fixest::feols. Fixed effects specified after "|".
second_stage: Second stage, these should be the treatment indicator(s) (e.g. treatment variable or event-study leads/lags). Formula following fixest::feols. Use i() for factor variables, see fixest::i.
treatment: A variable that = 1 if treated, = 0 otherwise. The first stage will be estimated for treatment == 0. The second stage will be estimated for the full sample.
cluster_var: What variable to cluster standard errors. This can be IDs or a higher aggregate level (state for example)
weights: Optional. Variable name for regression weights.
bootstrap: Optional. Should standard errors be calculated using bootstrap? Default is FALSE.
n_bootstraps: Optional. How many bootstraps to run. Default is 250.
return_bootstrap: Optional. Logical. Will return each bootstrap second-stage estimate to allow for manual use, e.g. percentile standard errors and empirical confidence intervals.
verbose: Optional. Logical. Should information about the two-stage procedure be printed back to the user? Default is TRUE.

Examples

Load example dataset which has two treatment groups and homogeneous treatment effects

# Load Example Dataset
data("df_hom")

Static TWFE

You can run a static TWFE fixed effect model for a simple treatment indicator

static <- did2s(df_hom,
    yname = "dep_var", treatment = "treat", cluster_var = "state",
    first_stage = ~ 0 | unit + year,
    second_stage = ~ i(treat, ref=FALSE))
fixest::esttable(static)
#>                            static
#> Dependent Var.:           dep_var
#>                                  
#> treat = TRUE    2.005*** (0.0202)
#> _______________ _________________
#> S.E.: Clustered         by: state
#> Observations               46,500
#> R2                        0.47520
#> Adj. R2                   0.47520
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Event Study

Or you can use relative-treatment indicators to estimate an event study estimate

es <- did2s(df_hom,
    yname = "dep_var", treatment = "treat", cluster_var = "state",
    first_stage = ~ 0 | unit + year,
    second_stage = ~ i(rel_year, ref=c(-1, Inf)))
fixest::esttable(es)
#>                                es
#> Dependent Var.:           dep_var
#>                                  
#> rel_year = -20    0.0043 (0.0322)
#> rel_year = -19    0.0222 (0.0296)
#> rel_year = -18   -0.0358 (0.0308)
#> rel_year = -17    0.0043 (0.0337)
#> rel_year = -16   -0.0186 (0.0353)
#> rel_year = -15   -0.0045 (0.0346)
#> rel_year = -14   -0.0393 (0.0384)
#> rel_year = -13    0.0453 (0.0323)
#> rel_year = -12    0.0324 (0.0309)
#> rel_year = -11   -0.0245 (0.0349)
#> rel_year = -10   -0.0017 (0.0241)
#> rel_year = -9     0.0155 (0.0242)
#> rel_year = -8    -0.0073 (0.0210)
#> rel_year = -7   -0.0513* (0.0202)
#> rel_year = -6     0.0269 (0.0237)
#> rel_year = -5     0.0136 (0.0237)
#> rel_year = -4    0.0381. (0.0223)
#> rel_year = -3    -0.0228 (0.0284)
#> rel_year = -2     0.0041 (0.0228)
#> rel_year = 0    1.971*** (0.0470)
#> rel_year = 1    2.050*** (0.0466)
#> rel_year = 2    2.033*** (0.0441)
#> rel_year = 3    1.966*** (0.0400)
#> rel_year = 4    1.965*** (0.0430)
#> rel_year = 5    2.030*** (0.0456)
#> rel_year = 6    2.040*** (0.0447)
#> rel_year = 7    1.995*** (0.0370)
#> rel_year = 8    2.019*** (0.0485)
#> rel_year = 9    1.955*** (0.0468)
#> rel_year = 10   1.950*** (0.0455)
#> rel_year = 11   2.117*** (0.0664)
#> rel_year = 12   2.132*** (0.0741)
#> rel_year = 13   2.019*** (0.0640)
#> rel_year = 14   2.013*** (0.0522)
#> rel_year = 15   1.961*** (0.0605)
#> rel_year = 16   1.916*** (0.0584)
#> rel_year = 17   1.938*** (0.0607)
#> rel_year = 18   2.070*** (0.0666)
#> rel_year = 19   2.066*** (0.0609)
#> rel_year = 20   1.964*** (0.0612)
#> _______________ _________________
#> S.E.: Clustered         by: state
#> Observations               46,500
#> R2                        0.47577
#> Adj. R2                   0.47533
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# plot rel_year coefficients and standard errors
fixest::coefplot(es, keep = "rel_year::(.*)")

Example from Cheng and Hoekstra (2013)

Here's an example using data from Cheng and Hoekstra (2013)

# Castle Data
castle <- haven::read_dta("https://github.com/scunning1975/mixtape/raw/master/castle.dta")
did2s(
	data = castle,
	yname = "l_homicide",
	first_stage = ~ 0 | sid + year,
	second_stage = ~ i(post, ref=0),
	treatment = "post",
	cluster_var = "state", weights = "popwt"
)
#> OLS estimation, Dep. Var.: l_homicide
#> Observations: 550
#> Weights: weights_vector
#> Standard-errors: Corrected Clustered (state) 
#>         Estimate Std. Error t value Pr(>|t|)    
#> post::1 0.075142    0.03538 2.12387 0.034127 *  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 0.109374   Adj. R2: 0.052465