Calculate estimators and bias-aware CIs for the sharp or fuzzy RD parameter, or for the value of the conditional mean at a point.
RDHonest(
formula,
data,
subset,
weights,
cutoff = 0,
M,
kern = "triangular",
na.action,
opt.criterion = "MSE",
h,
se.method = "nn",
alpha = 0.05,
beta = 0.8,
J = 3,
sclass = "H",
T0 = 0,
point.inference = FALSE,
sigmaY2,
sigmaD2,
sigmaYD,
clusterid
)
Returns an object of class "RDResults". The function
print can be used to obtain and print a summary of the results. An
object of class "RDResults" is a list containing four components.
First, a data frame "coefficients" containing the following
columns:
term: type of parameter being estimated
estimate: point estimate
std.error: standard error of estimate
maximum.bias: maximum bias of estimate
conf.low, conf.high: lower (upper) end-point of a two-sided CI based on estimate
conf.low.onesided, conf.high.onesided: lower (upper) end-point of a one-sided CI based on estimate
bandwidth: bandwidth used. If kern="optimal", the smoothing parameters bandwidth.m and bandwidth.p on either side of the cutoff are reported instead
eff.obs: number of effective observations
leverage: maximal leverage of estimate
cv: critical value used to compute two-sided CIs
alpha: coverage level, as specified by option alpha
method: smoothness class used (see sclass)
M: curvature bound used for worst-case bias calculations. For fuzzy RD, equals (abs(estimate)*M.fs+M.rf)/first.stage
M.rf, M.fs: curvature bound for the outcome (i.e. reduced-form) and first-stage regressions. Fuzzy RD only.
first.stage: estimate of the first-stage coefficient. Fuzzy RD only.
kernel: kernel used
p.value: p-value for testing the null of no effect
Second, a list called "data" containing the data used for
estimation. This is useful mostly for internal calculations. Third, an
object of class "lm" containing the local linear regression
estimates. Finally, an object called "call" containing the matched
call.
If kern="optimal", the "lm" object is empty, and the
numeric vectors "delta" and "omega" are returned in
addition. These correspond to the parameters in the modulus problem used
to compute the optimal estimation weights.
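The columns of "coefficients" are related to one another, and a short base R sketch may make that concrete. It assumes, following the papers listed in the references rather than anything stated on this page, that the two-sided critical value is the 1-alpha quantile of |N(t, 1)| with t = maximum.bias/std.error. The helper names bias_aware_cv, bias_aware_ci and fuzzy_M are hypothetical and are not part of the package; the numbers are made up.

```r
# Critical value for a two-sided bias-aware CI: the 1 - alpha quantile of
# |N(t, 1)|, where t = maximum.bias / std.error. (Assumption: taken from the
# cited papers, not from this help page.)
bias_aware_cv <- function(t, alpha = 0.05) {
  sqrt(stats::qchisq(1 - alpha, df = 1, ncp = t^2))
}

# Hypothetical helper: rebuild the two-sided CI from three reported columns.
bias_aware_ci <- function(estimate, std.error, maximum.bias, alpha = 0.05) {
  cv <- bias_aware_cv(maximum.bias / std.error, alpha)
  c(conf.low  = estimate - cv * std.error,
    conf.high = estimate + cv * std.error,
    cv        = cv)
}

# Fuzzy RD curvature bound, as given in the description of column M.
fuzzy_M <- function(estimate, M.fs, M.rf, first.stage) {
  (abs(estimate) * M.fs + M.rf) / first.stage
}

# Made-up numbers, for illustration only.
ci <- bias_aware_ci(estimate = 5, std.error = 1, maximum.bias = 0.5)
print(ci)
m_fuzzy <- fuzzy_M(estimate = 2, M.fs = 0.4, M.rf = 4, first.stage = 0.5)
print(m_fuzzy)
```

With zero maximum bias the critical value reduces to the usual normal quantile, so the interval is never shorter than a conventional one built from the same standard error.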
formula: an object of class "formula" (or one that can be
coerced to that class). The formula syntax is outcome ~
running_variable for inference at a point. For sharp RD, it is
outcome ~ running_variable if there are no covariates, or
outcome ~ running_variable | covariates if covariates are present.
For fuzzy RD, it is outcome | treatment ~ running_variable |
covariates, with covariates optional.
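The three formula shapes described here can be written down as ordinary R formula objects. The sketch below uses variable names from the examples on this page, except z1 and z2, which are hypothetical covariate names used only for illustration.

```r
# Sharp RD (or inference at a point) without covariates.
f_sharp <- voteshare ~ margin

# Fuzzy RD: outcome | treatment on the left-hand side.
f_fuzzy <- cn | retired ~ elig_year

# Sharp RD with covariates after "|"; z1 and z2 are hypothetical names.
f_cov <- voteshare ~ margin | z1 + z2

# Each object is a regular formula; all.vars() lists the variables it uses.
fuzzy_vars <- all.vars(f_fuzzy)
print(fuzzy_vars)
print(all.vars(f_cov))
```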
data: optional data frame, list or environment (or object coercible by
as.data.frame to a data frame) containing the outcome and running
variables in the model. If not found in data, the variables are
taken from environment(formula), typically the environment from
which the function is called.
subset: optional vector specifying a subset of observations to be used in the fitting process.
weights: Optional vector of weights to weight the observations (useful for aggregated data). The weights are interpreted as the number of observations that each aggregated data point averages over. Disregarded if optimal kernel is used.
cutoff: specifies the RD cutoff in the running variable. For inference at a point, specifies the point \(x_0\) at which to calculate the conditional mean.
M: Bound on second derivative of the conditional mean function, a
numeric vector of length one. For fuzzy RD, M needs to be a
numeric vector of length two, specifying the smoothness of the
conditional mean for the outcome and treatment, respectively.
kern: specifies the kernel function used in the local regression. It
can either be a string equal to "triangular"
(\(k(u)=(1-|u|)_{+}\)), "epanechnikov"
(\(k(u)=(3/4)(1-u^2)_{+}\)), or "uniform" (\(k(u)=
(|u|<1)/2\)), or else a kernel function. If equal to "optimal", use
the finite-sample optimal linear estimator under Taylor smoothness class,
instead of a local linear estimator.
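As an illustration, the three named kernels can be written directly from the formulas above in base R. The function names k_triangular, k_epanechnikov and k_uniform are hypothetical; this is only a sketch of the formulas, not the package's internal implementation.

```r
# Kernel functions matching the formulas listed for the kern argument.
# Each takes a numeric vector u and returns the kernel weights k(u).
k_triangular   <- function(u) pmax(1 - abs(u), 0)
k_epanechnikov <- function(u) 0.75 * pmax(1 - u^2, 0)
k_uniform      <- function(u) (abs(u) < 1) / 2

# Evaluate each kernel on a small grid to compare the weights.
u <- c(-1, -0.5, 0, 0.5, 1)
weights_by_kernel <- data.frame(
  u            = u,
  triangular   = k_triangular(u),
  epanechnikov = k_epanechnikov(u),
  uniform      = k_uniform(u)
)
print(weights_by_kernel)
```

All three kernels are zero outside (-1, 1) and integrate to one; they differ in how quickly the weight falls off as |u| grows.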
na.action: function which indicates what should happen when the data
contain NAs. The default is set by the na.action setting of
options (usually na.omit). Another possible value is
na.fail
opt.criterion: Optimality criterion that the bandwidth is designed to optimize. The options are:
"MSE": Finite-sample maximum MSE
"FLCI": Length of (fixed-length) two-sided confidence intervals.
"OCI": Given quantile of excess length of one-sided confidence intervals
The methods use conditional variance given by sigmaY2, if
supplied. For fuzzy RD, sigmaD2 and sigmaYD also need to be
supplied in this case. Otherwise, the methods use preliminary variance
estimates based on assuming homoskedasticity on either side of the
cutoff.
h: bandwidth, a scalar parameter. If not supplied, the optimal bandwidth is
computed according to the criterion given by opt.criterion.
se.method: method for estimating standard error of the estimate, one of:
"nn": Nearest neighbor method
"EHW": Eicker-Huber-White, with residuals from local regression (local polynomial estimators only).
"supplied.var": Use conditional variance supplied by sigmaY2
instead of computing residuals. For fuzzy RD, sigmaD2 and
sigmaYD also need to be supplied in this case.
alpha: determines the confidence level, 1-alpha, for
constructing/optimizing confidence intervals.
beta: Determines quantile of excess length to optimize, if bandwidth
optimizes given quantile of excess length of one-sided confidence
intervals (opt.criterion="OCI"); otherwise ignored.
J: Number of nearest neighbors, if se.method="nn" is specified.
Otherwise ignored.
sclass: Smoothness class, either "T" for Taylor or "H"
for Hölder class.
T0: Initial estimate of the treatment effect for calculating the optimal bandwidth. Only relevant for fuzzy RD.
point.inference: If TRUE, do inference at the point determined by cutoff
instead of RD inference.
sigmaY2: Supply variance of outcome. Ignored when kernel is optimal.
sigmaD2: Supply variance of treatment (fuzzy RD only).
sigmaYD: Supply covariance of treatment and outcome (fuzzy RD only).
clusterid: Vector specifying cluster membership. If supplied,
se.method="EHW" is required, and standard errors use
cluster-robust variance formulas.
The bandwidth is calculated to be optimal for a given performance criterion,
as specified by opt.criterion. Alternatively, for local polynomial
estimators, the bandwidth can be specified by h. For
kern="optimal", calculate optimal estimators under second-order Taylor
smoothness class (sharp RD only).
Timothy B. Armstrong and Michal Kolesár. Optimal inference in a class of regression models. Econometrica, 86(2):655–683, March 2018. doi:10.3982/ECTA14434
Timothy B. Armstrong and Michal Kolesár. Simple and honest confidence intervals in nonparametric regression. Quantitative Economics, 11(1):1–39, January 2020. doi:10.3982/QE1199
Michal Kolesár and Christoph Rothe. Inference in regression discontinuity designs with a discrete running variable. American Economic Review, 108(8):2277–2304, August 2018. doi:10.1257/aer.20160945
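library(RDHonest)
# Sharp RD with a uniform kernel and a fixed bandwidth
RDHonest(voteshare ~ margin, data = lee08, kern = "uniform", M = 0.1, h = 10)
# Fuzzy RD: M gives the bounds for the outcome and the treatment
RDHonest(cn | retired ~ elig_year, data = rcp, cutoff = 0, M = c(4, 0.4),
         kern = "triangular", opt.criterion = "MSE", T0 = 0, h = 3)
# Inference at a point (the cutoff), using observations above it
RDHonest(voteshare ~ margin, data = lee08, subset = margin > 0,
         kern = "uniform", M = 0.1, h = 10, point.inference = TRUE)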