`npglpreg`

computes a generalized local polynomial kernel
regression estimate (Hall and Racine (2015)) of a one (1)
dimensional dependent variable on an `r`

-dimensional vector of
continuous and categorical
(`factor`

/`ordered`

) predictors.

`npglpreg(…)`# S3 method for default
npglpreg(tydat = NULL,
txdat = NULL,
eydat = NULL,
exdat = NULL,
bws = NULL,
degree = NULL,
leave.one.out = FALSE,
ckertype = c("gaussian", "epanechnikov", "uniform", "truncated gaussian"),
ckerorder = 2,
ukertype = c("liracine", "aitchisonaitken"),
okertype = c("liracine", "wangvanryzin"),
bwtype = c("fixed", "generalized_nn", "adaptive_nn", "auto"),
gradient.vec = NULL,
gradient.categorical = FALSE,
cv.shrink = TRUE,
cv.maxPenalty = sqrt(.Machine$double.xmax),
cv.warning = FALSE,
Bernstein = TRUE,
mpi = FALSE,
…)

# S3 method for formula
npglpreg(formula,
data = list(),
tydat = NULL,
txdat = NULL,
eydat = NULL,
exdat = NULL,
bws = NULL,
degree = NULL,
leave.one.out = FALSE,
ckertype = c("gaussian", "epanechnikov","uniform","truncated gaussian"),
ckerorder = 2,
ukertype = c("liracine", "aitchisonaitken"),
okertype = c("liracine", "wangvanryzin"),
bwtype = c("fixed", "generalized_nn", "adaptive_nn", "auto"),
cv = c("degree-bandwidth", "bandwidth", "none"),
cv.func = c("cv.ls", "cv.aic"),
nmulti = 5,
random.seed = 42,
degree.max = 10,
degree.min = 0,
bandwidth.max = .Machine$double.xmax,
bandwidth.min = sqrt(.Machine$double.eps),
bandwidth.min.numeric = 1.0e-02,
bandwidth.switch = 1.0e+06,
bandwidth.scale.categorical = 1.0e+04,
max.bb.eval = 10000,
min.epsilon = .Machine$double.eps,
initial.mesh.size.real = 1,
initial.mesh.size.integer = 1,
min.mesh.size.real = sqrt(.Machine$double.eps),
min.mesh.size.integer = 1,
min.poll.size.real = 1,
min.poll.size.integer = 1,
opts=list(),
restart.from.min = FALSE,
gradient.vec = NULL,
gradient.categorical = FALSE,
cv.shrink = TRUE,
cv.maxPenalty = sqrt(.Machine$double.xmax),
cv.warning = FALSE,
Bernstein = TRUE,
mpi = FALSE,
…)

formula

a symbolic description of the model to be fit

data

an optional data frame containing the variables in the model

tydat

a one (1) dimensional numeric or integer vector of dependent data, each
element \(i\) corresponding to each observation (row) \(i\) of
`txdat`

. Defaults to the training data used to
compute the bandwidth object

txdat

a \(p\)-variate data frame of explanatory data (training data) used to calculate the regression estimators. Defaults to the training data used to compute the bandwidth object

eydat

a one (1) dimensional numeric or integer vector of the true values of the dependent variable. Optional, and used only to calculate the true errors

exdat

a \(p\)-variate data frame of points on which the regression will be
estimated (evaluation data). By default,
evaluation takes place on the data provided by `txdat`

bws

a vector of bandwidths, with each element \(i\) corresponding
to the bandwidth for column \(i\) in `txdat`

degree

integer/vector specifying the polynomial degree of the
for each dimension of the continuous `x`

in `txdat`

leave.one.out

a logical value to specify whether or not to compute the leave one
out sums. Will not work if `exdat`

is specified. Defaults to
`FALSE`

ckertype

character string used to specify the continuous kernel type.
Can be set as `gaussian`

, `epanechnikov`

, or
`uniform`

. Defaults to `gaussian`

.

ckerorder

numeric value specifying kernel order (one of
`(2,4,6,8)`

). Kernel order specified along with a
`uniform`

continuous kernel type will be ignored. Defaults to
`2`

.

ukertype

character string used to specify the unordered categorical kernel type.
Can be set as `aitchisonaitken`

or `liracine`

. Defaults to
`liracine`

okertype

character string used to specify the ordered categorical kernel type.
Can be set as `wangvanryzin`

or `liracine`

. Defaults to
`liracine`

bwtype

character string used for the continuous variable bandwidth type,
specifying the type of bandwidth to compute and return in the
`bandwidth`

object. If `bwtype="auto"`

, the bandwidth type
type will be automatically determined by cross-validation. Defaults
to `fixed`

. Option summary:
`fixed`

: compute fixed bandwidths
`generalized_nn`

: compute generalized nearest neighbor bandwidths
`adaptive_nn`

: compute adaptive nearest neighbor bandwidths

cv

a character string (default `cv="nomad"`

) indicating
whether to use nonsmooth mesh adaptive direct search, or no search
(i.e. use supplied values for `degree`

and `bws`

)

cv.func

a character string (default `cv.func="cv.ls"`

)
indicating which method to use to select smoothing
parameters. `cv.aic`

specifies expected Kullback-Leibler
cross-validation (Hurvich, Simonoff, and Tsai (1998)), and
`cv.ls`

specifies least-squares cross-validation

max.bb.eval

argument passed to the NOMAD solver (see `snomadr`

for
further details)

min.epsilon

argument passed to the NOMAD solver (see `snomadr`

for
further details)

initial.mesh.size.real

argument passed to the NOMAD solver (see `snomadr`

for
further details)

initial.mesh.size.integer

argument passed to the NOMAD solver (see `snomadr`

for
further details)

min.mesh.size.real

argument passed to the NOMAD solver (see `snomadr`

for
further details)

min.mesh.size.integer

arguments passed to the NOMAD solver (see `snomadr`

for
further details)

min.poll.size.real

arguments passed to the NOMAD solver (see `snomadr`

for
further details)

min.poll.size.integer

arguments passed to the NOMAD solver (see `snomadr`

for
further details)

opts

list of optional arguments passed to the NOMAD solver
(see `snomadr`

for further details)

nmulti

integer number of times to restart the process of finding extrema of
the cross-validation function from different (random) initial
points (default `nmulti=5`

)

random.seed

when it is not missing and not equal to 0, the
initial points will be generated using this seed when using
`snomadr`

degree.max

the maximum degree of the polynomial for
each of the continuous predictors (default `degree.max=10`

)

degree.min

the minimum degree of the polynomial for
each of the continuous predictors (default `degree.min=0`

)

bandwidth.max

the maximum bandwidth scale (i.e. number of
scaled standard deviations) for each of the continuous predictors
(default `bandwidth.max=.Machine$double.xmax`

)

bandwidth.min

the minimum bandwidth scale for each of the
categorical predictors (default `sqrt(.Machine$double.eps)`

)

bandwidth.min.numeric

the minimum bandwidth scale (i.e. number
of scaled standard deviations) for each of the continuous predictors
(default `bandwidth.min=1.0e-02`

)

bandwidth.switch

the minimum bandwidth scale (i.e. number of
scaled standard deviations) for each of the continuous predictors
(default `bandwidth.switch=1.0e+06`

) before the local polynomial
is treated as global during cross-validation at which point a global
categorical kernel weighted least squares fit is used for
computational efficiency

bandwidth.scale.categorical

the upper end for the rescaled
bandwidths for the categorical predictors (default
`bandwidth.scale.categorical=1.0e+04`

) - the aim is to ‘even up’
the scale of the search parameters as much as possible, so when very
large scale factors are selected for the continuous predictors, a
larger value may improve search

restart.from.min

a logical value indicating to recommence
`snomadr`

with the optimal values found from its first
invocation (typically quick but sometimes recommended in the field of
optimization)

gradient.vec

a vector corresponding to the order of the partial (or cross-partial) and which variable the partial (or cross-partial) derivative(s) are required

gradient.categorical

a logical value indicating whether discrete gradients (i.e. differences in the response from the base value for each categorical predictor) are to be computed

cv.shrink

a logical value indicating whether to use ridging
(Seifert and Gasser (2000)) for ill-conditioned inversion during
cross-validation (default `cv.shrink=TRUE`

) or to instead test
for ill-conditioned matrices and penalize heavily when this is the
case (much stronger condition imposed on cross-validation)

cv.maxPenalty

a penalty applied during cross-validation when a delete-one estimate is not finite or the polynomial basis is not of full column rank

cv.warning

a logical value indicating whether to issue an
immediate warning message when ill conditioned bases are encountered
during cross-validation (default `cv.warning=FALSE`

)

Bernstein

a logical value indicating whether to use raw polynomials or Bernstein polynomials (default) (note that a Bernstein polynomial is also know as a Bezier curve which is also a B-spline with no interior knots)

mpi

a logical value (default `mpi=FALSE`

) that, when
`mpi=TRUE`

, can call the `npRmpi`

rather than the `np`

package (note - code needs to mirror examples in the demo directory of
the `npRmpi`

package, you need to broadcast loading of the
`crs`

package, and need `.Rprofile`

in your current
directory)

…

additional arguments supplied to specify the regression type, bandwidth type, kernel types, training data, and so on, detailed below

`npglpreg`

returns a `npglpreg`

object. The generic
functions `fitted`

and `residuals`

extract
(or generate) estimated values and residuals. Furthermore, the
functions `summary`

, `predict`

, and
`plot`

(options `deriv=0`

, `ci=FALSE`

[`ci=TRUE`

produces pointwise bootstrap error bounds],
`persp.rgl=FALSE`

,
`plot.behavior=c("plot","plot-data","data")`

,
`plot.errors.boot.num=99`

,
`plot.errors.type=c("quantiles","standard")`

[`"quantiles"`

produces percentiles determined by
`plot.errors.quantiles`

below, `"standard"`

produces error
bounds given by +/- 1.96 bootstrap standard deviations],
`plot.errors.quantiles=c(.025,.975)`

, `xtrim=0.0`

,
`xq=0.5`

) support objects of this type. The returned object has
the following components:

estimates of the regression function (conditional mean) at the sample points or evaluation points

residuals computed at the sample points or evaluation points

integer/vector specifying the degree of the polynomial
for each dimension of the continuous `x`

the estimated gradient (vector) corresponding to the vector
`gradient.vec`

the estimated gradient (matrix) for the categorical predictors

the supplied `gradient.vec`

vector of bandwidths

the supplied `bwtype`

a symbolic description of the model

coefficient of determination (Doksum and Samarov (1995))

Typical usages are (see below for a list of options and also the examples at the end of this help file)

## Conduct generalized local polynomial estimationmodel <- npglpreg(y~x1+x2)

## Conduct degree 0 local polynomial estimation ## (i.e. Nadaraya-Watson)

model <- npglpreg(y~x1+x2,cv="bandwidth",degree=c(0,0))

## Conduct degree 1 local polynomial estimation (i.e. local linear)

model <- npglpreg(y~x1+x2,cv="bandwidth",degree=c(1,1))

## Conduct degree 2 local polynomial estimation (i.e. local ## quadratic)

model <- npglpreg(y~x1+x2,cv="bandwidth",degree=c(2,2))

## Plot the mean and bootstrap confidence intervals

plot(model,ci=TRUE)

## Plot the first partial derivatives and bootstrap confidence ## intervals

plot(model,deriv=1,ci=TRUE)

## Plot the first second partial derivatives and bootstrap ## confidence intervals

plot(model,deriv=2,ci=TRUE)

This function is in beta status until further notice (eventually it will be rolled into the np/npRmpi packages after the final status of snomadr/NOMAD gets sorted out).

Optimizing the cross-validation function jointly for bandwidths
(vectors of continuous parameters) and polynomial degrees (vectors of
integer parameters) constitutes a mixed-integer optimization
problem. These problems are not only ‘hard’ from the numerical
optimization perspective, but are also computationally intensive
(contrast this to where we conduct, say, local linear regression which
sets the degree of the polynomial vector to a global value
`degree=1`

hence we only need to optimize with respect to the
continuous bandwidths). Because of this we must be mindful of the
presence of local optima (the objective function is non-convex and
non-differentiable). Restarting search from different initial starting
points is recommended (see `nmulti`

) and by default this is done
more than once. We encourage users to adopt ‘multistarting’ and
to investigate the impact of changing default search parameters such
as `initial.mesh.size.real`

, `initial.mesh.size.integer`

,
`min.mesh.size.real`

,
`min.mesh.size.integer`

,`min.poll.size.real`

, and
`min.poll.size.integer`

. The default values were chosen based on
extensive simulation experiments and were chosen so as to yield robust
performance while being mindful of excessive computation - of course,
no one setting can be globally optimal.

Doksum, K. and A. Samarov (1995), “Nonparametric Estimation of Global Functionals and a Measure of the Explanatory Power of Covariates in Regression,” The Annals of Statistics, 23, 1443-1473.

Hall, P. and J.S. Racine (2015), “Infinite Order Cross-Validated Local Polynomial Regression,” Journal of Econometrics, 185, 510-525.

Li, Q. and J.S. Racine (2007), *Nonparametric Econometrics:
Theory and Practice,* Princeton University Press.

Seifert, B. and T. Gasser (2000), “Data Adaptive Ridging in Local Polynomial Regression,” Journal of Computational and Graphical Statistics, 9(2), 338-360.

# NOT RUN { set.seed(42) n <- 100 x1 <- runif(n,-2,2) x2 <- runif(n,-2,2) y <- x1^3 + rnorm(n,sd=1) ## Ideally the method should choose large bandwidths for x1 and x2 and a ## generalized polynomial that is a cubic for x1 and degree 0 for x2. model <- npglpreg(y~x1+x2,nmulti=1) summary(model) ## Plot the partial means and percentile confidence intervals plot(model,ci=T) ## Extract the data from the plot object and plot it separately myplot.dat <- plot(model,plot.behavior="data",ci=T) matplot(myplot.dat[[1]][,1],myplot.dat[[1]][,-1],type="l") matplot(myplot.dat[[2]][,1],myplot.dat[[2]][,-1],type="l") # }