rddensity (version 1.0)

rddensity: Manipulation Testing Using Local Polynomial Density Estimation

Description

rddensity implements manipulation testing procedures using the local polynomial density estimator proposed in Cattaneo, Jansson and Ma (2019). For a review on manipulation testing see McCrary (2008).

Companion commands: rdbwdensity for data-driven bandwidth selection and rdplotdensity for density plots. A companion Stata package is described in Cattaneo, Jansson and Ma (2018).

Related Stata and R packages useful for inference in regression discontinuity (RD) designs are described at https://sites.google.com/site/rdpackages.

Usage

rddensity(X, c = 0, p = 2, q = 0, kernel = "", fitselect = "",
  h = c(), bwselect = "", vce = "", all = FALSE)

Arguments

X

Numeric vector or one dimensional matrix / data frame, the running variable.

c

Numeric, specifies the threshold or cutoff value in the support of X, which determines the two samples (e.g., control and treatment units in RD settings). Default is 0.

p

Integer, specifies the order of the local-polynomial used to construct the density point estimators. Default is 2 (local quadratic approximation).

q

Integer, specifies the order of the local-polynomial used to construct the bias-corrected density point estimators. Default is p+1 (local cubic approximation).

kernel

String, specifies the kernel function used to construct the local-polynomial estimator(s). Options are: "triangular", "epanechnikov", and "uniform". Default is "triangular".
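
As a rough illustration of the three kernel shapes, the sketch below writes them in their textbook forms on [-1, 1]. The helper name kernel_weight is hypothetical and the scaling is an assumption; the package's internal implementation may differ.

```r
# Textbook kernel functions supported on [-1, 1].
# Illustrative only; not taken from the rddensity source.
kernel_weight <- function(u, kernel = "triangular") {
  inside <- abs(u) <= 1
  switch(kernel,
    triangular   = (1 - abs(u)) * inside,
    epanechnikov = 0.75 * (1 - u^2) * inside,
    uniform      = 0.5 * inside,
    stop("unknown kernel: ", kernel)
  )
}

# Weights for observations at scaled distances (x - c) / h from the cutoff.
u <- c(-1.5, -0.5, 0, 0.5, 1)
kernel_weight(u, "triangular")
kernel_weight(u, "uniform")
```

Observations farther than one bandwidth from the cutoff (|u| > 1) receive zero weight under all three kernels; the triangular kernel additionally downweights observations as they move away from the cutoff.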

fitselect

String, specifies whether restrictions should be imposed. Options are: "unrestricted" (default) for density estimation without any restrictions (two-sample, unrestricted inference); "restricted" for density estimation assuming equal c.d.f. and higher-order derivatives.

h

Numeric, specifies the bandwidth used to construct the density estimators on the two sides of the cutoff. If not specified, the bandwidth is computed by the companion command rdbwdensity. If two bandwidths are specified, the first bandwidth is used for the data below the cutoff and the second bandwidth is used for the data above the cutoff.

bwselect

String, specifies the bandwidth selection procedure to be used. Options are: "each" for bandwidth selection based on the MSE of each density separately (two distinct bandwidths); "diff" for bandwidth selection based on the MSE of the difference of densities (one common bandwidth); "sum" for bandwidth selection based on the MSE of the sum of densities (one common bandwidth); "comb" (default) for a combination of the alternatives above: for fitselect = "unrestricted", it selects median(each,diff,sum); for fitselect = "restricted", it selects min(diff,sum).
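
The "comb" rule described above can be sketched in a few lines of base R. The helper combine_bandwidths and the candidate values are hypothetical, and taking the median side by side is an assumption about how the two "each" bandwidths enter the rule.

```r
# Hypothetical helper mirroring the documented "comb" rule; not the package's code.
combine_bandwidths <- function(h_each, h_diff, h_sum, fitselect = "unrestricted") {
  if (fitselect == "unrestricted") {
    # Assumption: the median is taken side by side, pairing each side's own
    # "each" bandwidth with the common "diff" and "sum" bandwidths.
    c(left  = median(c(h_each[["left"]],  h_diff, h_sum)),
      right = median(c(h_each[["right"]], h_diff, h_sum)))
  } else {
    # Restricted fit: one common bandwidth, the smaller of "diff" and "sum".
    h <- min(h_diff, h_sum)
    c(left = h, right = h)
  }
}

# Made-up candidate bandwidths, for illustration only.
h_each <- c(left = 0.40, right = 0.70)
combine_bandwidths(h_each, h_diff = 0.55, h_sum = 0.60)
combine_bandwidths(h_each, h_diff = 0.55, h_sum = 0.60, fitselect = "restricted")
```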

vce

String, specifies the procedure used to compute the variance-covariance matrix estimator. Options are: "plugin" for asymptotic plug-in standard errors; "jackknife" (default) for jackknife standard errors.

all

Boolean. If TRUE, rddensity reports two testing procedures (given the choices of fitselect and bwselect): the conventional test statistic (not valid when using an MSE-optimal bandwidth choice) and the robust bias-corrected statistic.

Value

hat

left/right: density estimate to the left/right of cutoff; diff: difference in estimated densities on the two sides of cutoff.

sd_asy

left/right: standard error for the estimated density to the left/right of the cutoff; diff: standard error for the difference in estimated densities. (Based on asymptotic formula.)

sd_jk

left/right: standard error for the estimated density to the left/right of the cutoff; diff: standard error for the difference in estimated densities. (Based on the jackknife method.)

test

t_asy/t_jk: t-statistic for the density discontinuity test, with standard error based on asymptotic formula or the jackknife; p_asy/p_jk: p-value for the density discontinuity test, with standard error based on asymptotic formula or the jackknife.
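
To make the relationship between the estimates, the standard error and the reported test concrete, here is a minimal sketch with made-up numbers. It assumes the statistic is the right-minus-left density difference divided by its standard error and that the p-value is two-sided under a standard normal reference distribution; neither detail is stated on this page.

```r
# Made-up estimates, for illustration only (not output from rddensity).
hat_left  <- 0.30   # estimated density just left of the cutoff
hat_right <- 0.18   # estimated density just right of the cutoff
se_diff   <- 0.05   # standard error of the difference

# Assumption: difference taken as right minus left.
t_stat <- (hat_right - hat_left) / se_diff

# Assumption: two-sided p-value against the standard normal distribution.
p_value <- 2 * pnorm(-abs(t_stat))

c(t = t_stat, p = p_value)
```

A small p-value is evidence of a discontinuity in the density at the cutoff, which is what a manipulation test looks for.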

hat_p

Same as hat, without bias correction (only available when all=TRUE).

sd_asy_p

Same as sd_asy, without bias correction (only available when all=TRUE).

sd_jk_p

Same as sd_jk, without bias correction (only available when all=TRUE).

test_p

Same as test, without bias correction (only available when all=TRUE).

N

full: full sample size; left/right: sample size to the left/right of the cutoff; eff_left/eff_right: effective sample size to the left/right of the cutoff (this depends on the bandwidth).
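
The sample-size components can be approximated by hand, which helps show why the effective sample size depends on the bandwidth. The bandwidths below are arbitrary, and the handling of observations exactly at the cutoff or at the window edges is an assumption.

```r
set.seed(42)
x <- rnorm(2000, mean = -0.5)

cutoff  <- 0
h_left  <- 0.5   # illustrative bandwidth below the cutoff
h_right <- 0.5   # illustrative bandwidth above the cutoff

n_full  <- length(x)
n_left  <- sum(x <  cutoff)
n_right <- sum(x >= cutoff)   # assumption: ties at the cutoff count as "right"

# Effective sample sizes: observations within one bandwidth of the cutoff.
n_eff_left  <- sum(x <  cutoff & x >= cutoff - h_left)
n_eff_right <- sum(x >= cutoff & x <= cutoff + h_right)

c(full = n_full, left = n_left, right = n_right,
  eff_left = n_eff_left, eff_right = n_eff_right)
```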

h

left/right: bandwidth used to the left/right of the cutoff.

opt

Collects the options used, includes: fitselect, kernel, bwselectl, bwselect, hscale, vce, c, p, q, all. See options for rddensity.

X_min

left/right: the smallest observation to the left/right of the cutoff.

X_max

left/right: the largest observation to the left/right of the cutoff.
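
The returned components can be inspected as sketched below. This assumes the value is a list whose elements are themselves lists named as described above (for example hat with left, right and diff); the call is guarded so that nothing runs if the package is not installed.

```r
set.seed(42)
x <- rnorm(2000, mean = -0.5)

# Guarded so the sketch is a no-op when rddensity is not installed.
if (requireNamespace("rddensity", quietly = TRUE)) {
  fit <- rddensity::rddensity(X = x, c = 0, all = TRUE)

  print(fit$hat$diff)     # difference in estimated densities at the cutoff
  print(fit$test$p_jk)    # p-value based on jackknife standard errors
  print(fit$h$left)       # bandwidth used to the left of the cutoff
  print(fit$N$eff_left)   # effective sample size to the left of the cutoff
  print(fit$test_p$p_jk)  # counterpart without bias correction (all = TRUE)
}
```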

References

M.D. Cattaneo, M. Jansson and X. Ma. (2018). Manipulation Testing based on Density Discontinuity. Stata Journal 18(1): 234-261.

M.D. Cattaneo, M. Jansson and X. Ma. (2019). Simple Local Polynomial Density Estimators. Journal of the American Statistical Association, forthcoming.

J. McCrary. (2008). Manipulation of the Running Variable in the Regression Discontinuity Design: A Density Test. Journal of Econometrics 142(2): 698-714.

See Also

rdbwdensity, rdplotdensity

Examples

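library(rddensity)

# Continuous density: no discontinuity at the default cutoff c = 0
set.seed(42)
x <- rnorm(2000, mean = -0.5)
summary(rddensity(X = x, vce = "jackknife"))

# Discontinuous density: stretching the positive draws halves the density just above 0
x[x > 0] <- x[x > 0] * 2
summary(rddensity(X = x, vce = "jackknife"))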
