lspkselect implements data-driven procedures to select the Integrated Mean Squared Error (IMSE) optimal number of partitioning knots for partitioning-based least squares regression estimators. Three series methods are supported: B-splines, compactly supported wavelets, and piecewise polynomials.
See Cattaneo and Farrell (2013) and Cattaneo, Farrell and Feng (2019a) for complete details.
Companion commands: lsprobust for partitioning-based least squares regression estimation and inference; lsprobust.plot for plotting results; lsplincom for multiple sample estimation and inference.
A detailed introduction to this command is given in Cattaneo, Farrell and Feng (2019b).
For more details, and related Stata and R packages useful for empirical analysis, visit https://sites.google.com/site/nppackages/.
lspkselect(y, x, m = NULL, m.bc = NULL, smooth = NULL,
  bsmooth = NULL, deriv = NULL, method = "bs", ktype = "uni",
  kselect = "imse-dpi", proj = TRUE, bc = "bc3", vce = "hc2",
  subset = NULL, rotnorm = TRUE)

# S3 method for lspkselect
print(x, ...)

# S3 method for lspkselect
summary(object, ...)
ks: A matrix that may contain k.rot (IMSE-optimal number of knots for the main regression, ROT implementation), k.bias.rot (IMSE-optimal number of knots for bias correction, ROT implementation), k.dpi (IMSE-optimal number of knots for the main regression, DPI implementation), and k.bias.dpi (IMSE-optimal number of knots for bias correction, DPI implementation), depending on the kselect option.
opt: A list containing the options passed to the function.
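A minimal sketch of inspecting these components, assuming data y and x generated as in the Examples section and the usual $ accessor for S3 list objects; the column names follow the description above.

# Minimal sketch, assuming y and x from the Examples section; kselect = "all"
# is requested so that both ROT and DPI columns appear in the ks matrix.
est <- lspkselect(y, x, kselect = "all")
est$ks    # selected numbers of knots: k.rot, k.bias.rot, k.dpi, k.bias.dpi
est$opt   # list of options passed to the call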
y: Outcome variable.
x: Independent variable. A matrix or data frame.
m: Order of basis used in the main regression. Default is m=2.
m.bc: Order of basis used to estimate the leading bias. Default is m.bc=m+1.
smooth: Smoothness of B-splines for point estimation. When smooth=s, B-splines have s-order continuous derivatives. Default is smooth=m-2.
bsmooth: Smoothness of B-splines for bias correction. Default is bsmooth=m.bc-2.
deriv: Derivative order of the regression function to be estimated. A vector of length ncol(x). Default is deriv=c(0,...,0).
method: Type of basis used for expansion. Options are "bs" for B-splines, "wav" for compactly supported wavelets (Cohen, Daubechies and Vial, 1993), and "pp" for piecewise polynomials. Default is method="bs".
ktype: Knot placement. Options are "uni" for evenly spaced knots over the support of x and "qua" for quantile-spaced knots. Default is ktype="uni".
kselect: Method for selecting the number of inner knots used by lspkselect. Options are "imse-rot" for a rule-of-thumb (ROT) implementation of the IMSE-optimal number of knots, "imse-dpi" for a second-generation direct plug-in (DPI) implementation of the IMSE-optimal number of knots, and "all" for both. Default is kselect="imse-dpi". (A usage sketch follows the argument descriptions below.)
proj: If TRUE, the projection of the leading approximation error onto the lower-order approximating space is included for bias correction (splines and piecewise polynomials only). Default is proj=TRUE.
bc: Bias correction method. Options are "bc1" for higher-order-basis bias correction, "bc2" for least squares bias correction, and "bc3" for plug-in bias correction. Defaults are "bc3" for splines and piecewise polynomials and "bc2" for wavelets.
vce: Procedure to compute the heteroskedasticity-consistent (HCk) variance-covariance matrix estimator with plug-in residuals. Options are "hc0" for unweighted residuals (HC0), "hc1" for HC1 weights, "hc2" for HC2 weights, and "hc3" for HC3 weights. Default is vce="hc2".
subset: Optional rule specifying a subset of observations to be used.
rotnorm: If TRUE, ROT selection is adjusted using normal densities. Default is rotnorm=TRUE.
...: Further arguments.
x, object: Class lspkselect objects (for the print and summary methods).
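As a usage sketch for the arguments above (not a recommendation of particular settings), the call below uses only option values documented in this help page and assumes y and x are generated as in the Examples section.

# Hedged sketch: piecewise-polynomial basis with quantile-spaced knots,
# both ROT and DPI selectors, and HC3 variance weights; all values are
# documented options. Assumes y and x as in the Examples section.
est_pp <- lspkselect(y, x, method = "pp", ktype = "qua",
                     kselect = "all", vce = "hc3")
summary(est_pp)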
print: print method for class "lspkselect".
summary: summary method for class "lspkselect".
Matias D. Cattaneo, Princeton University, Princeton, NJ. cattaneo@princeton.edu.
Max H. Farrell, University of Chicago, Chicago, IL. max.farrell@chicagobooth.edu.
Yingjie Feng (maintainer), Princeton University, Princeton, NJ. yingjief@princeton.edu.
Cattaneo, M. D., and M. H. Farrell (2013): Optimal convergence rates, Bahadur representation, and asymptotic normality of partitioning estimators. Journal of Econometrics 174(2): 127-143.
Cattaneo, M. D., M. H. Farrell, and Y. Feng (2019a): Large Sample Properties of Partitioning-Based Series Estimators. Annals of Statistics, forthcoming. arXiv:1804.04916.
Cattaneo, M. D., M. H. Farrell, and Y. Feng (2019b): lspartition: Partitioning-Based Least Squares Regression. R Journal, forthcoming. arXiv:1906.00202.
Cohen, A., I. Daubechies, and P. Vial (1993): Wavelets on the Interval and Fast Wavelet Transforms. Applied and Computational Harmonic Analysis 1(1): 54-81.
lsprobust, lsprobust.plot, lsplincom
library(lspartition)
# Simulated data: two uniform covariates, nonlinear outcome.
x <- data.frame(runif(500), runif(500))
y <- sin(4*x[,1]) + cos(x[,2]) + rnorm(500)
# IMSE-optimal number of knots (default DPI implementation).
est <- lspkselect(y, x)
summary(est)
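A possible follow-up, sketched under the assumption that the companion command lsprobust accepts the same y and x inputs; see its own help page for how to pass a specific number of knots.

# Rough sketch only: estimation and inference with the companion command
# lsprobust, assuming it takes the same y and x inputs; consult the
# lsprobust documentation for knot-related arguments.
fit <- lsprobust(y, x)
summary(fit)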