rd2d.dist: Local Polynomial RD Estimation on Distance-Based Running Variables

Description

rd2d.dist implements distance-based local polynomial boundary regression discontinuity (RD) point estimators with robust bias-corrected pointwise confidence intervals and uniform confidence bands, as developed in Cattaneo, Titiunik and Yu (2025a), with a companion software article in Cattaneo, Titiunik and Yu (2025b). For robust bias correction, see Calonico, Cattaneo and Titiunik (2014).

Companion command: rdbw2d.dist for data-driven bandwidth selection.

For other packages for RD designs, visit https://rdpackages.github.io/

Usage

rd2d.dist(
  Y,
  D,
  h = NULL,
  b = NULL,
  p = 1,
  q = 2,
  kink = c("off", "on"),
  kernel = c("tri", "triangular", "epa", "epanechnikov", "uni", "uniform", "gau",
    "gaussian"),
  level = 95,
  cbands = TRUE,
  side = c("two", "left", "right"),
  repp = 1000,
  bwselect = c("mserd", "imserd", "msetwo", "imsetwo", "user provided"),
  vce = c("hc1", "hc0", "hc2", "hc3"),
  rbc = c("on", "off"),
  bwcheck = 50 + p + 1,
  masspoints = c("check", "adjust", "off"),
  C = NULL,
  scaleregul = 1,
  cqt = 0.5
)

Value

An object of class "rd2d.dist", a list containing:

results

A data frame with point estimates, variances, p-values, confidence intervals, confidence bands, and bandwidths at each evaluation point.

b1: First coordinate of the evaluation point.
b2: Second coordinate of the evaluation point.
Est.p: Point estimate of \(\widehat{\tau}_{\text{dist},p}(\mathbf{b})\) with polynomial order \(p\).
Se.p: Standard error of \(\widehat{\tau}_{\text{dist},p}(\mathbf{b})\).
Est.q: Bias-corrected estimate \(\widehat{\tau}_{\text{dist},q}(\mathbf{b})\) with polynomial order \(q\).
Se.q: Standard error of \(\widehat{\tau}_{\text{dist},q}(\mathbf{b})\).
pvalue: Two-sided p-value based on \(T_{\text{dist},q}(\mathbf{b})\).
CI.lower: Lower bound of confidence interval.
CI.upper: Upper bound of confidence interval.
CB.lower: Lower bound of uniform confidence band (if cbands=TRUE).
CB.upper: Upper bound of uniform confidence band (if cbands=TRUE).
h0: Bandwidth used for control group (\(D_i(\mathbf{b}) < 0\)).
h1: Bandwidth used for treatment group (\(D_i(\mathbf{b}) \geq 0\)).
Nh0: Effective sample size for control group.
Nh1: Effective sample size for treatment group.

results.A0

Same structure as results but for control group outcomes.

results.A1

Same structure as results but for treatment group outcomes.

tau.hat

Vector of point estimates \(\widehat{\tau}_{\text{dist},p}(\mathbf{b})\).

se.hat

Standard errors corresponding to \(\widehat{\tau}_{\text{dist},p}(\mathbf{b})\).

cb

Confidence intervals and uniform bands.

cov.q

Covariance matrix of the bias-corrected estimates \(\widehat{\tau}_{\text{dist},q}(\mathbf{b})\) across all evaluation points \(\mathbf{b}\).

opt

List of options used in the function call.

Arguments

Y

Dependent variable; a numeric vector of length \(N\), where \(N\) is the sample size.

D

Distance-based scores \(\mathbf{D}_i=(\mathbf{D}_{i}(\mathbf{b}_1),\cdots,\mathbf{D}_{i}(\mathbf{b}_J))\); an \(N \times J\) matrix, where \(N\) is the sample size and \(J\) is the number of cutoffs (evaluation points). Non-negative values indicate observations in the treatment group; negative values indicate observations in the control group.
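The sign convention for D can be illustrated with a small base R sketch. The simulated data, the assignment rule (treated if x1 >= 0), and all object names below are assumptions made for illustration only.

```r
# Sketch: build an N x J matrix of signed distances to J boundary points.
# All names and the assignment rule are hypothetical.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
treated <- x1 >= 0                      # assumed treatment assignment rule

# J = 3 boundary points on the line x1 = 0
b <- cbind(b1 = c(0, 0, 0), b2 = c(-1, 0, 1))

# Euclidean distance from every observation to every boundary point
dist <- sapply(seq_len(nrow(b)), function(j) {
  sqrt((x1 - b[j, "b1"])^2 + (x2 - b[j, "b2"])^2)
})

# Non-negative for treated observations, negative for controls
D <- dist * ifelse(treated, 1, -1)

dim(D)
```

Each column of D is the signed distance to one boundary point, so column j can be paired with row j of the b argument.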

h

Bandwidth(s); if a single value is provided, the same bandwidth is used for both groups; if a matrix of size \(J \times 2\) is provided, each row contains \((h_{\text{control}}, h_{\text{tr}})\) for the corresponding evaluation point; if not specified, bandwidths are selected via rdbw2d.dist().

b

Optional evaluation points; a matrix or data frame specifying boundary points \(\mathbf{b}_j = (b_{1j}, b_{2j})\), dimension \(J \times 2\).

p

Polynomial order for point estimation. Default is p = 1.

q

Polynomial order for bias-corrected estimation. Must satisfy \(q \geq p\). Default is q = p + 1.
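To make the role of the polynomial order concrete, the following is a minimal sketch of a distance-based local polynomial estimate at a single boundary point: fit a weighted polynomial in the signed distance on each side and take the difference of the intercepts. The helper local_poly_rd, the fixed triangular kernel, and the simulated data are assumptions for illustration; this is not the package's implementation and omits bias correction and inference.

```r
# Sketch only: hypothetical helper, not the rd2d.dist implementation.
local_poly_rd <- function(y, d, h, p = 1) {
  # Triangular kernel weights on the scaled signed distance
  w <- pmax(0, 1 - abs(d) / h)
  fit_side <- function(keep) {
    idx <- keep & w > 0
    dat <- data.frame(y = y[idx], d = d[idx])
    fit <- lm(y ~ poly(d, degree = p, raw = TRUE), data = dat,
              weights = w[idx])
    unname(coef(fit)[1])          # intercept = fitted value at distance 0
  }
  # Treatment side (d >= 0) minus control side (d < 0)
  fit_side(d >= 0) - fit_side(d < 0)
}

set.seed(2)
n  <- 4000
x1 <- rnorm(n)
x2 <- rnorm(n)
d  <- ifelse(x1 >= 0, 1, -1) * sqrt(x1^2 + x2^2)  # signed distance to (0, 0)
y  <- 1 + x1 + 0.5 * x2 + 2 * (x1 >= 0) + rnorm(n, sd = 0.3)

tau_hat <- local_poly_rd(y, d, h = 0.8, p = 1)
tau_hat
```

In this simulated design the jump at the boundary is 2, so tau_hat should land close to that value; raising p trades lower bias for higher variance.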

kink

Whether to apply the kink adjustment. Options: "on" or "off" (default).

kernel

Kernel function to use. Options are "tri" or "triangular" (triangular, default), "epa" or "epanechnikov" (Epanechnikov), "uni" or "uniform" (uniform), and "gau" or "gaussian" (Gaussian).
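For reference, the textbook forms of these kernels can be written as below. This is a sketch under the assumption of standard normalizations on the scaled distance u = d / h; the function kernel_weight is hypothetical, and the package may scale its kernels differently.

```r
# Sketch: textbook kernel shapes evaluated at scaled distance u = d / h.
# kernel_weight is a hypothetical helper, not part of the package.
kernel_weight <- function(u, kernel = c("triangular", "epanechnikov",
                                        "uniform", "gaussian")) {
  kernel <- match.arg(kernel)
  inside <- abs(u) <= 1           # compact support for the first three kernels
  switch(kernel,
    triangular   = (1 - abs(u)) * inside,
    epanechnikov = 0.75 * (1 - u^2) * inside,
    uniform      = 0.5 * inside,
    gaussian     = dnorm(u)       # unbounded support
  )
}

u <- c(-1.5, -0.5, 0, 0.5, 1.5)
sapply(c("triangular", "epanechnikov", "uniform", "gaussian"),
       function(k) kernel_weight(u, k))
```

The triangular kernel puts the most weight on observations nearest the boundary, while the Gaussian kernel gives positive weight to every observation.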

level

Nominal confidence level for intervals/bands, between 0 and 100 (default is 95).
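As a rough guide to how level maps to a pointwise interval, the sketch below builds two-sided normal-based intervals and p-values from a bias-corrected estimate and its standard error. The numeric values are hypothetical, and the construction assumes the usual robust bias-corrected form (centered at Est.q, scaled by Se.q).

```r
# Sketch: two-sided pointwise intervals at a given nominal level.
level <- 95
alpha <- 1 - level / 100
z     <- qnorm(1 - alpha / 2)     # normal critical value

est_q <- c(1.48, 1.53)            # hypothetical bias-corrected estimates
se_q  <- c(0.10, 0.12)            # hypothetical standard errors

ci     <- cbind(lower = est_q - z * se_q, upper = est_q + z * se_q)
pvalue <- 2 * pnorm(-abs(est_q / se_q))   # two-sided p-value from the t-statistic

ci
pvalue
```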

cbands

Logical. If TRUE, also compute uniform confidence bands (default is TRUE).

side

Type of confidence interval. Options: "two" (two-sided, default), "left" (left tail), or "right" (right tail).

repp

Number of bootstrap repetitions used for critical value simulation. Default is 1000.
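One common way to simulate a critical value for a uniform band is to draw Gaussian vectors with the correlation structure of the estimates and take a quantile of the maximum absolute value. The sketch below assumes that scheme and uses a made-up covariance matrix; the package's actual simulation procedure may differ.

```r
# Sketch: simulated critical value for a uniform band over J = 3 points.
# cov_q is a hypothetical covariance matrix of the bias-corrected estimates.
cov_q <- matrix(c(0.010, 0.006, 0.002,
                  0.006, 0.012, 0.005,
                  0.002, 0.005, 0.011), nrow = 3)
repp  <- 1000
level <- 95

set.seed(3)
corr <- cov2cor(cov_q)
# Rows of z are Gaussian draws with correlation matrix `corr`
z     <- matrix(rnorm(repp * nrow(corr)), nrow = repp) %*% chol(corr)
sup_t <- apply(abs(z), 1, max)    # max |t| across evaluation points per draw

cv_uniform   <- unname(quantile(sup_t, level / 100))
cv_pointwise <- qnorm(1 - (1 - level / 100) / 2)

c(uniform = cv_uniform, pointwise = cv_pointwise)
```

The uniform critical value exceeds the pointwise one, which is why bands are wider than intervals; a larger repp reduces simulation noise in the quantile.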

bwselect

Bandwidth selection strategy. Options:

"mserd". One common MSE-optimal bandwidth selector for the boundary RD treatment effect estimator for each evaluation point (default).
"imserd". IMSE-optimal bandwidth selector for the boundary RD treatment effect estimator based on all evaluation points.
"msetwo". Two different MSE-optimal bandwidth selectors (control and treatment) for the boundary RD treatment effect estimator for each evaluation point.
"imsetwo". Two IMSE-optimal bandwidth selectors (control and treatment) for the boundary RD treatment effect estimator based on all evaluation points.
"user provided". User-provided bandwidths. If h is not NULL, then bwselect is overwritten to "user provided".

vce

Variance-covariance estimator for standard errors. Options:

"hc0": Heteroskedasticity-robust variance estimator without small sample adjustment (White robust).
"hc1": Heteroskedasticity-robust variance estimator with degrees-of-freedom correction (default).
"hc2": Heteroskedasticity-robust variance estimator using leverage adjustments.
"hc3": More conservative heteroskedasticity-robust variance estimator (similar to jackknife correction).
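The four options differ only in how squared residuals are rescaled inside the sandwich formula. The sketch below shows the standard definitions for an unweighted least-squares fit on simulated data; it is an illustration of the general estimators, not of how the package applies them to kernel-weighted local polynomial fits.

```r
# Sketch: HC0-HC3 sandwich variances for an ordinary least-squares fit.
set.seed(4)
n <- 200
x <- runif(n, -1, 1)
y <- 1 + 2 * x + rnorm(n, sd = 0.5 + abs(x))   # heteroskedastic noise

X       <- cbind(1, x)
e       <- lm.fit(X, y)$residuals
k       <- ncol(X)
XtX_inv <- solve(crossprod(X))
lev     <- rowSums((X %*% XtX_inv) * X)        # leverage values h_ii

# (X'X)^{-1} X' diag(omega) X (X'X)^{-1}
sandwich <- function(omega) {
  XtX_inv %*% crossprod(X * sqrt(omega)) %*% XtX_inv
}

vcov_hc <- list(
  hc0 = sandwich(e^2),                  # no small-sample adjustment
  hc1 = sandwich(e^2 * n / (n - k)),    # degrees-of-freedom correction
  hc2 = sandwich(e^2 / (1 - lev)),      # leverage adjustment
  hc3 = sandwich(e^2 / (1 - lev)^2)     # jackknife-like, most conservative
)

sapply(vcov_hc, function(v) sqrt(diag(v)))
```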

rbc

Whether to apply robust bias correction. Options: "on" (default) or "off". When kink = "off", turning rbc on sets q to p + 1. When kink = "on", turning rbc on shrinks the selected bandwidth to be proportional to \(N^{-1/3}\).

bwcheck

If a positive integer is provided, the preliminary bandwidth used in the calculations is enlarged so that at least bwcheck observations are used. The program stops with “not enough observations” if sample size \(N\) < bwcheck. Default is 50 + p + 1.
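The idea behind bwcheck can be sketched as follows, assuming the check counts observations by absolute distance with both sides pooled. The helper enlarge_bandwidth is hypothetical and the package's exact rule may differ.

```r
# Sketch: enlarge a bandwidth so that at least `bwcheck` observations fall
# within it. Hypothetical helper, not part of the package.
enlarge_bandwidth <- function(d, h, bwcheck) {
  if (length(d) < bwcheck) stop("not enough observations")
  h_min <- sort(abs(d))[bwcheck]  # distance of the bwcheck-th nearest point
  max(h, h_min)
}

d <- c(-0.9, -0.4, -0.1, 0.05, 0.3, 0.7, 1.2)
enlarge_bandwidth(d, h = 0.2, bwcheck = 4)   # returns 0.4
```

With h = 0.2 only two observations lie within the window, so the bandwidth grows to 0.4, the distance of the fourth-nearest observation.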

masspoints

Strategy for handling mass points in the running variable. Options:

"check": Check for repeated values and adjust inference if needed (default).
"adjust": Adjust bandwidths to guarantee a sufficient number of unique support points.
"off": Ignore mass points completely.
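A simple diagnostic for mass points is the share of observations that duplicate an earlier value of the running variable. The sketch below is only an illustration with a hypothetical helper; the criterion the package uses to flag mass points is not documented here.

```r
# Sketch: share of observations that repeat an existing value of the
# signed distance. Hypothetical helper, not part of the package.
mass_point_share <- function(d) 1 - length(unique(d)) / length(d)

d <- c(-0.5, -0.5, -0.2, 0.1, 0.1, 0.1, 0.4, 0.9)
mass_point_share(d)   # 0.375: 5 unique values among 8 observations
```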

C

Cluster ID variable used for cluster-robust variance estimation. Default is C = NULL.

scaleregul

Scaling factor for the regularization term in bandwidth selection. Default is 1.

cqt

Constant controlling subsample fraction for initial bias estimation. Default is 0.5.

Author

Matias D. Cattaneo, Princeton University. cattaneo@princeton.edu
Rocío Titiunik, Princeton University. titiunik@princeton.edu
Ruiqi Rae Yu, Princeton University. rae.yu@princeton.edu

Details

MSE bandwidth selection for geometrical RD design

References

Cattaneo, M. D., Titiunik, R., Yu, R. R. (2025a). Estimation and Inference in Boundary Discontinuity Designs.
Cattaneo, M. D., Titiunik, R., Yu, R. R. (2025b). rd2d: Causal Inference in Boundary Discontinuity Designs.
Calonico, S., Cattaneo, M. D., Titiunik, R. (2014). Robust Nonparametric Confidence Intervals for Regression-Discontinuity Designs.

Examples

set.seed(123)
n <- 5000

# Generate running variables x1 and x2
x1 <- rnorm(n)
x2 <- rnorm(n)

# Define treatment assignment: treated if x1 >= 0
d <- as.numeric(x1 >= 0)

# Generate outcome variable y with some treatment effect
y <- 3 + 2 * x1 + 1.5 * x2 + 1.5 * d + rnorm(n, sd = 0.5)

# Define evaluation points (e.g., at the origin and another point)
eval <- data.frame(x.1 = c(0, 0), x.2 = c(0, 1))

# Compute Euclidean distances to evaluation points
dist.a <- sqrt((x1 - eval$x.1[1])^2 + (x2 - eval$x.2[1])^2)
dist.b <- sqrt((x1 - eval$x.1[2])^2 + (x2 - eval$x.2[2])^2)

# Combine distances into a data frame (one column per evaluation point)
D <- as.data.frame(cbind(dist.a, dist.b))

# Assign positive distances for treatment group, negative for control
d_expanded <- matrix(rep(2 * d - 1, times = ncol(D)), nrow = nrow(D), ncol = ncol(D))
D <- D * d_expanded

# Run the rd2d.dist function
result <- rd2d.dist(y, D, b = eval)

# View the estimation results
print(result)
summary(result)
