lpbwdensity implements the bandwidth selection methods for local polynomial based density (and derivatives) estimation proposed and studied in Cattaneo, Jansson and Ma (2020, 2023). See Cattaneo, Jansson and Ma (2022) for more implementation details and illustrations.
Companion command: lpdensity for estimation and robust bias-corrected inference.
Related Stata and R packages useful for nonparametric estimation and inference are available at https://nppackages.github.io/.
lpbwdensity(
data,
grid = NULL,
p = NULL,
v = NULL,
kernel = c("triangular", "uniform", "epanechnikov"),
bwselect = c("mse-dpi", "imse-dpi", "mse-rot", "imse-rot"),
massPoints = TRUE,
stdVar = TRUE,
regularize = TRUE,
nLocalMin = NULL,
nUniqueMin = NULL,
Cweights = NULL,
Pweights = NULL
)
BW: A matrix containing (1) grid (grid point), (2) bw (bandwidth), (3) nh (number of observations in each local neighborhood), and (4) nhu (number of unique observations in each local neighborhood).
opt: A list containing options passed to the function.
data: Numeric vector or one-dimensional matrix/data frame, the raw data.
grid: Numeric, specifies the grid of evaluation points. When set to default, grid points are chosen as the 0.05 to 0.95 quantiles of the data, in steps of 0.05.
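The default grid described above can be approximated in base R. This is a minimal sketch: using quantile() with its default interpolation type is an assumption, so the values may not match the package's internal computation exactly.

```r
# Simulated data, as in the example for this function
set.seed(42)
X <- rnorm(2000)

# Probabilities 0.05, 0.10, ..., 0.95
probs <- seq(0.05, 0.95, by = 0.05)

# Approximate default grid: sample quantiles of the data at those probabilities.
# Assumption: quantile()'s default type; the package may compute these differently.
default_grid <- unname(quantile(X, probs = probs))

length(default_grid)
```

A grid built this way can be passed explicitly through the grid argument when a different spacing or range is wanted.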
p: Nonnegative integer, specifies the order of the local polynomial used to construct point estimates. (Default is 2.)
v: Nonnegative integer, specifies the derivative of the distribution function to be estimated: 0 for the distribution function, 1 (default) for the density function, etc.
kernel: String, specifies the kernel function; should be one of "triangular", "uniform" or "epanechnikov".
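For reference, the three kernel choices can be written out in their standard textbook forms. The helper kernel_fun below is hypothetical (it is not part of the package), and the normalizations shown are the usual ones, assumed here rather than taken from the package source.

```r
# Standard textbook forms of the three kernels, all supported on [-1, 1].
# Assumption: the package uses these usual normalizations.
kernel_fun <- function(u, kernel = c("triangular", "uniform", "epanechnikov")) {
  kernel <- match.arg(kernel)
  inside <- abs(u) <= 1  # indicator of the kernel support
  switch(kernel,
    triangular   = (1 - abs(u)) * inside,
    uniform      = 0.5 * inside,
    epanechnikov = 0.75 * (1 - u^2) * inside
  )
}

u <- c(-1.5, -0.5, 0, 0.5)
kernel_fun(u, "triangular")    # 0.0000 0.5000 1.0000 0.5000
kernel_fun(u, "uniform")       # 0.0000 0.5000 0.5000 0.5000
kernel_fun(u, "epanechnikov")  # 0.0000 0.5625 0.7500 0.5625
```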
bwselect: String, specifies the method for data-driven bandwidth selection. This option will be ignored if bw is provided. Can be (1) "mse-dpi" (default, mean squared error-optimal bandwidth selected for each grid point); (2) "imse-dpi" (integrated MSE-optimal bandwidth, common for all grid points); (3) "mse-rot" (rule-of-thumb bandwidth with Gaussian reference model); or (4) "imse-rot" (integrated rule-of-thumb bandwidth with Gaussian reference model).
massPoints: TRUE (default) or FALSE, specifies whether point estimates and standard errors should be adjusted if there are mass points in the data.
stdVar: TRUE (default) or FALSE, specifies whether the data should be standardized for bandwidth selection.
regularize: TRUE (default) or FALSE, specifies whether the bandwidth should be regularized. When set to TRUE, the bandwidth is chosen such that the local region includes at least nLocalMin observations and at least nUniqueMin unique observations.
nLocalMin: Nonnegative integer, specifies the minimum number of observations in each local neighborhood. This option will be ignored if regularize=FALSE. Default is 20+p+1.
nUniqueMin: Nonnegative integer, specifies the minimum number of unique observations in each local neighborhood. This option will be ignored if regularize=FALSE. Default is 20+p+1.
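The regularization rule described above can be illustrated with a small base R sketch. The helper regularize_bw is hypothetical and is not the package's implementation; it only shows one way a bandwidth could be enlarged until the window around an evaluation point contains the required numbers of observations and of unique observations. It assumes both minimums are at least 1 and that the window is closed, i.e. it includes points at distance exactly h.

```r
# Hypothetical helper (not part of lpdensity): enlarge bandwidth h at
# evaluation point x until the window [x - h, x + h] holds at least
# n_local_min observations and at least n_unique_min distinct observations.
regularize_bw <- function(x, h, data, n_local_min, n_unique_min) {
  d_all    <- sort(abs(data - x))          # distances of all observations
  d_unique <- sort(abs(unique(data) - x))  # distances of distinct values
  # Smallest half-widths meeting each requirement (capped at the sample size)
  h_local  <- d_all[min(n_local_min, length(d_all))]
  h_unique <- d_unique[min(n_unique_min, length(d_unique))]
  max(h, h_local, h_unique)
}

# Toy data with a mass point at 0
toy <- c(0, 0, 0, 1, 2, 3, 4)

# Three observations lie within 0.5 of x = 0, but only one distinct value,
# so the bandwidth is enlarged to 1 to reach two distinct values.
regularize_bw(x = 0, h = 0.5, data = toy, n_local_min = 3, n_unique_min = 2)

# With the documented defaults and p = 2, both minimums would be 20 + p + 1 = 23.
p <- 2
n_min <- 20 + p + 1
```

A bandwidth that already satisfies both requirements is returned unchanged, which matches the description of regularization as a lower bound on the window size.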
Cweights: Numeric vector, specifies the weights used for counterfactual distribution construction. Should have the same length as the data. This option will be ignored if bwselect is "mse-rot" or "imse-rot".
Pweights: Numeric vector, specifies the weights used in sampling. Should have the same length as the data. This option will be ignored if bwselect is "mse-rot" or "imse-rot".
Matias D. Cattaneo, Princeton University. cattaneo@princeton.edu.
Michael Jansson, University of California Berkeley. mjansson@econ.berkeley.edu.
Xinwei Ma (maintainer), University of California San Diego. x1ma@ucsd.edu.
Cattaneo, M. D., M. Jansson, and X. Ma. 2020. Simple Local Polynomial Density Estimators. Journal of the American Statistical Association, 115(531): 1449-1455. doi:10.1080/01621459.2019.1635480
Cattaneo, M. D., M. Jansson, and X. Ma. 2022. lpdensity: Local Polynomial Density Estimation and Inference. Journal of Statistical Software, 101(2): 1-25. doi:10.18637/jss.v101.i02
Cattaneo, M. D., M. Jansson, and X. Ma. 2023. Local Regression Distribution Estimators. Journal of Econometrics, forthcoming. doi:10.1016/j.jeconom.2021.01.006
Supported methods: coef.lpbwdensity, print.lpbwdensity, summary.lpbwdensity.
# Load the package that provides lpbwdensity
library(lpdensity)
# Generate a random sample
set.seed(42); X <- rnorm(2000)
# Construct bandwidth
bw1 <- lpbwdensity(X)
summary(bw1)
# Display bandwidths for a subset of grid points
summary(bw1, grid=bw1$BW[4:10, "grid"])
summary(bw1, gridIndex=4:10)