ivnr: Instrument-based treatment evaluation under endogeneity and non-response bias

Description

Non- and semiparametric treatment effect estimation under treatment endogeneity and selective non-response in the outcome based on a binary instrument for the treatment and a continuous instrument for response.

Usage

ivnr(
  y,
  d,
  r,
  z1,
  z2,
  x = NULL,
  xpar = NULL,
  ruleofthumb = 1,
  wgtfct = 2,
  rtype = "ll",
  numresprob = 20,
  boot = 499,
  estlate = TRUE,
  trim = 0.01
)

Value

An ivnr object contains one output component:

output: The first row provides the effect estimates under non- and semi-parametric estimation using both instruments, see "nonpara (L)ATE IV" and "semipara (L)ATE IV" as well as under a missing-at-random assumption for response when using only the first instrument for the treatment, see "nonpara (L)ATE MAR" and "semipara (L)ATE MAR". The second row provides the standard errors based on bootstrapping the effects. The third row provides the p-values based on the t-statistics.
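
As a rough illustration of how the three rows relate, the sketch below builds a matrix with the layout described above from made-up numbers. The estimates, the standard errors, the row names, and the use of a standard normal reference distribution for the p-values are all assumptions made for illustration; the package may compute its p-values differently.

```r
# Hypothetical effect estimates and bootstrap standard errors
# (made-up numbers, NOT output from ivnr()).
est <- c(0.95, 1.02, 0.80, 0.85)
se  <- c(0.40, 0.35, 0.20, 0.18)

# Two-sided p-values from the t-statistics, assuming a standard normal
# reference distribution (an assumption, not confirmed by the documentation).
tstat <- est / se
pval  <- 2 * pnorm(-abs(tstat))

# Stack effects, standard errors, and p-values as rows.
# The row names are hypothetical labels chosen for readability.
output <- rbind(effect = est, se = se, p.value = pval)
colnames(output) <- c("nonpara (L)ATE IV", "semipara (L)ATE IV",
                      "nonpara (L)ATE MAR", "semipara (L)ATE MAR")
print(round(output, 3))
```

With a fitted ivnr object, the corresponding matrix would be read from its output component rather than constructed by hand.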

Arguments

y: Dependent variable.
d: Treatment, must be binary and must not contain missings.
r: Response, must be a binary indicator for whether the outcome is observed.
z1: Binary instrument for the treatment, must not contain missings.
z2: Continuous instrument for response, must not contain missings.
x: A data frame of covariates to be included in the nonparametric estimation, must not contain missings. Factors and ordered variables must be appropriately defined as such by factor() and ordered(). Default is NULL (no covariates included). Covariates are only considered if both x and xpar are not NULL.
xpar: Covariates to be included in the semiparametric estimation, must not contain missings. Default is NULL (no covariates included). Covariates are only considered if both x and xpar are not NULL.
ruleofthumb: If 1, bandwidth selection in any kernel function is based on the Silverman (1986) rule of thumb. Otherwise, least squares cross-validation is used. Default is 1.
wgtfct: Weighting function to be used in effect estimation. If set to 1, equation (18) in Fricke et al. (2020) is used as weight. If set to 2, equation (19) in Fricke et al. (2020) is used as weight. If set to 3, the median of the LATEs across the numresprob response probability values is used. Default is 2.
rtype: Regression type used for continuous outcomes in the kernel regressions. Either "ll" for local linear or "lc" for local constant regression. Default is "ll".
numresprob: Number of response probabilities at which the effects are evaluated. An equidistant grid is constructed based on the number provided. Default is 20.
boot: Number of bootstrap replications for estimating standard errors of the effects. Default is 499.
estlate: If set to TRUE the local average treatment effect on compliers (LATE) is estimated, otherwise the average treatment effect (ATE) is estimated. Default is TRUE.
trim: Trimming rule for too extreme denominators in the weighting functions or inverses of products of conditional treatment probabilities. Values below trim are set to trim to avoid values that are too close to zero in any denominator. Default is 0.01.
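
The tuning arguments ruleofthumb, numresprob, and trim can be pictured with a few lines of base R. This is only a sketch of the general ideas, not the package's internal code: the simulated variable, the stand-in response probabilities, the grid endpoints, and the example denominators are all hypothetical.

```r
set.seed(1)
z2 <- rnorm(500)   # stand-in for a continuous instrument (simulated)

# Silverman (1986) rule-of-thumb bandwidth. bw.nrd0() from the stats package
# computes 0.9 * min(sd, IQR / 1.34) * n^(-1/5).
h <- bw.nrd0(z2)

# Equidistant grid with numresprob points. Taking the range of the
# (hypothetical) response probabilities as endpoints is an assumption.
numresprob <- 20
p_hat <- pnorm(z2)   # hypothetical response probabilities
grid  <- seq(min(p_hat), max(p_hat), length.out = numresprob)

# Trimming: values below trim are set to trim before entering a denominator.
trim  <- 0.01
denom <- c(0.002, 0.05, 0.3)   # hypothetical denominators
denom_trimmed <- pmax(denom, trim)
print(denom_trimmed)
```

Trimming by a floor keeps every observation in the sample while bounding the largest possible weight at 1/trim.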

Details

Non- and semiparametric treatment effect estimation under treatment endogeneity and selective non-response in the outcome based on a binary instrument for the treatment and a continuous instrument for response. The effects are estimated both semi-parametrically (using probit and OLS for the estimation of plug-in parameters like conditional probabilities and outcomes) and fully non-parametrically (based on kernel regression for any conditional probability/mean). Besides the instrument-based estimates, results are also presented under a missing-at-random assumption (MAR) when not using the instrument z2 for response (but only z1 for the treatment). See Fricke et al. (2020) for further details.
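
To make the notion of semi-parametric plug-in parameters more concrete, the following sketch fits a probit model for the response probability and an OLS regression for the conditional mean outcome among respondents, using base R only. It is a generic illustration with simulated data and hypothetical model specifications; it is not the estimator of Fricke et al. (2020) and not the code used inside ivnr().

```r
set.seed(1)
n  <- 500
x  <- runif(n, -0.5, 0.5)                # observed covariate
z2 <- rnorm(n)                           # continuous instrument for response
d  <- rbinom(n, 1, 0.5)                  # binary treatment (exogenous in this toy example)
r  <- as.numeric(-0.25 * x + z2 + d + rnorm(n) > 0)   # response indicator
y  <- -0.25 * x + d + rnorm(n)           # outcome
y[r == 0] <- NA                          # outcome unobserved for non-respondents

# Probit plug-in: estimated response probability given d, z2, and x.
probit_fit <- glm(r ~ d + z2 + x, family = binomial(link = "probit"))
p_resp <- fitted(probit_fit)

# OLS plug-in: conditional mean outcome among respondents.
ols_fit <- lm(y ~ d + x, subset = r == 1)

summary(p_resp)
coef(ols_fit)
```

In the fully non-parametric variant, these parametric fits would be replaced by kernel regressions of the same conditional probabilities and means.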

References

Fricke, H., Frölich, M., Huber, M., Lechner, M. (2020): "Endogeneity and non-response bias in treatment evaluation - nonparametric identification of causal effects by instruments", Journal of Applied Econometrics, forthcoming.

Examples


# A small example with simulated data (1000 observations)
if (FALSE) {
library(mvtnorm)       # provides rmvnorm()
library(causalweight)  # provides ivnr()
n <- 1000                              # sample size
# correlated error terms of the treatment, outcome, and response equations
e <- rmvnorm(n, rep(0, 3), matrix(c(1, 0.5, 0.5,  0.5, 1, 0.5,  0.5, 0.5, 1), 3, 3))
x <- runif(n, -0.5, 0.5)               # observed confounder
z1 <- (-0.25 * x + rnorm(n) > 0) * 1   # binary instrument for treatment
z2 <- -0.25 * x + rnorm(n)             # continuous instrument for selection
d <- (z1 - 0.25 * x + e[, 1] > 0) * 1  # treatment equation
y_star <- -0.25 * x + d + e[, 2]       # latent outcome
r <- (-0.25 * x + z2 + d + e[, 3] > 0) * 1  # response equation
y <- y_star                            # observed outcome
y[r == 0] <- 0                         # nonobserved outcomes are set to zero
# The true treatment effect is 1
ivnr(y = y, d = d, r = r, z1 = z1, z2 = z2, x = x, xpar = x, numresprob = 4, boot = 39)
}
