TSHT: Two-Stage Hard Thresholding

Description

Performs the Two-Stage Hard Thresholding (TSHT) method, which provides robust inference for the treatment effect in the presence of invalid instrumental variables.

Usage

TSHT(
  Y,
  D,
  Z,
  X,
  intercept = TRUE,
  method = c("OLS", "DeLasso", "Fast.DeLasso"),
  voting = c("MaxClique", "MP", "Conservative"),
  robust = TRUE,
  alpha = 0.05,
  tuning.1st = NULL,
  tuning.2nd = NULL
)

Value

TSHT returns an object of class "TSHT", which is a list containing the following components:

betaHat: The estimate of the treatment effect.
beta.sdHat: The estimated standard error of betaHat.
ci: The \(1-\alpha\) confidence interval for the treatment effect \(\beta\), where \(\alpha\) is the alpha argument.
SHat: The set of selected relevant IVs.
VHat: The set of selected relevant and valid IVs.
voting.mat: The voting matrix.
check: An indicator of whether the majority rule is satisfied.

Arguments

Y: The outcome observations, a vector of length \(n\).
D: The treatment observations, a vector of length \(n\).
Z: The instrument observations, a matrix of dimension \(n \times p_z\).
X: The covariate observations, a matrix of dimension \(n \times p_x\).
intercept: Whether to include an intercept. (default = TRUE)
method: The method used to estimate the reduced form parameters. "OLS" stands for ordinary least squares, "DeLasso" stands for the debiased Lasso estimator, and "Fast.DeLasso" stands for the debiased Lasso estimator with fast algorithm. (default = "OLS")
voting: The voting option used to estimate valid IVs. "MP" stands for majority and plurality voting, "MaxClique" stands for finding the maximal clique in the IV voting matrix, and "Conservative" stands for the conservative voting procedure, which is used to obtain an initial estimator of valid IVs in the Searching-Sampling method. (default = "MaxClique")
robust: If TRUE, the method is robust to heteroskedastic errors. If FALSE, the method assumes homoskedastic errors. (default = TRUE)
alpha: The significance level for the confidence interval. (default = 0.05)
tuning.1st: The tuning parameter used in the 1st stage to select relevant instruments. If NULL, it is chosen in a data-dependent way; see Details. (default = NULL)
tuning.2nd: The tuning parameter used in the 2nd stage to select valid instruments. If NULL, it is chosen in a data-dependent way; see Details. (default = NULL)

Details

When robust = TRUE, method is set to "OLS". When voting = "MaxClique" and there are multiple maximum cliques, betaHat, beta.sdHat, ci, and VHat are lists in which each element corresponds to one maximum clique. If the tuning parameters for the 1st and 2nd stages are not specified, both default to \(\sqrt{\log n}\) for method "OLS" and to \(\max{(\sqrt{2.01 \log p_z}, \sqrt{\log n})}\) for the other methods.
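
The default tuning rule described above can be sketched in base R. The helper name default_tuning is hypothetical and is not part of the package; it only reproduces the two formulas stated here, and the package's internal computation may differ in detail.

```r
# Hypothetical helper (not part of the package): reproduces the default
# tuning rule described in Details for a sample of size n with pz instruments.
default_tuning <- function(n, pz, method = c("OLS", "DeLasso", "Fast.DeLasso")) {
  method <- match.arg(method)
  if (method == "OLS") {
    # OLS: sqrt(log n) for both stages
    sqrt(log(n))
  } else {
    # Debiased Lasso variants: the larger of the two thresholds
    max(sqrt(2.01 * log(pz)), sqrt(log(n)))
  }
}

default_tuning(n = 500, pz = 8)                         # OLS rule
default_tuning(n = 500, pz = 2000, method = "DeLasso")  # pz term dominates here
```

With few instruments relative to the sample size, the two rules coincide; the \(\sqrt{2.01 \log p_z}\) term only takes over when \(p_z\) is large compared with \(n\).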

References

Guo, Z., Kang, H., Cai, T. T. and Small, D. S. (2018), Confidence intervals for causal effects with invalid instruments by using two-stage hard thresholding with voting, J. R. Stat. Soc. B, 80: 793-815.

Examples


data("lineardata")
Y <- lineardata[,"Y"]
D <- lineardata[,"D"]
Z <- as.matrix(lineardata[,c("Z.1","Z.2","Z.3","Z.4","Z.5","Z.6","Z.7","Z.8")])
X <- as.matrix(lineardata[,c("age","sex")])
TSHT.model <- TSHT(Y=Y,D=D,Z=Z,X=X)
summary(TSHT.model)
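
Because betaHat, beta.sdHat, ci, and VHat become lists when voting = "MaxClique" finds several maximum cliques, code that reads these components directly may want to handle both shapes. The following is a minimal sketch of one way to do that; as_clique_list is a hypothetical helper, and the two objects are mock stand-ins with made-up numbers rather than real TSHT output (in particular, the exact shape of ci in the package is assumed here to be a length-2 vector).

```r
# Hypothetical helper (not part of the package): wrap a component in a list
# unless it already is one, so both cases can be iterated the same way.
as_clique_list <- function(x) if (is.list(x)) x else list(x)

# Mock objects standing in for TSHT output; the numbers are made up.
single <- list(betaHat = 1.0, ci = c(0.8, 1.2))
multi  <- list(betaHat = list(1.0, 0.4),
               ci      = list(c(0.8, 1.2), c(0.1, 0.7)))

for (fit in list(single, multi)) {
  est <- as_clique_list(fit$betaHat)
  cis <- as_clique_list(fit$ci)
  # One line per maximum clique
  for (k in seq_along(est)) {
    cat(sprintf("clique %d: betaHat = %.2f, CI = [%.2f, %.2f]\n",
                k, est[[k]], cis[[k]][1], cis[[k]][2]))
  }
}
```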
