np.gcv

Generalized cross-validation bandwidth selection in nonparametric regression models

From a sample $\{(Y_i, t_i): i=1,\ldots,n\}$, this routine computes an optimal bandwidth for estimating $m$ in the regression model $$Y_i = m(t_i) + \epsilon_i.$$ The regression function, $m$, is a smooth but unknown function. The optimal bandwidth is selected by means of the generalized cross-validation procedure. Kernel smoothing is used.

Keywords
regression, time series, Statistical Inference, Nonparametric Statistics
Usage
np.gcv(data = data, h.seq=NULL, num.h = 50, estimator = "NW", 
kernel = "quadratic")
Arguments
data
data[, 1] contains the values of the response variable, $Y$; data[, 2] contains the values of the explanatory variable, $t$.
h.seq
sequence of considered bandwidths in the GCV function. If NULL (the default), num.h equidistant values between zero and a quarter of the range of $t_i$ are considered.
num.h
number of values used to build the sequence of considered bandwidths. If h.seq is not NULL, num.h=length(h.seq). Otherwise, the default is 50.
estimator
estimator to be used: NW (Nadaraya-Watson) or LLP (Local Linear Polynomial). The default is NW.
kernel
kernel function to be used: gaussian, quadratic (Epanechnikov kernel), triweight or uniform. The default is quadratic.
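The default grid described under h.seq can be sketched in a few lines of base R. This is only an illustration: default_h_grid is a hypothetical helper, not a PLRModels function, and the choice to start the grid one step above zero is an assumption, since the exact endpoints used by the package are not stated here.

```r
# Hypothetical helper (not part of PLRModels): one plausible reading of the
# documented default for h.seq.
default_h_grid <- function(t, num.h = 50) {
  h_max <- diff(range(t)) / 4   # a quarter of the range of t
  # Assumption: start one step above zero, because a zero bandwidth is not
  # usable for kernel smoothing. The package's endpoints may differ.
  seq(h_max / num.h, h_max, length.out = num.h)
}

# Design points as in Example 2a
t <- ((1:100) - 0.5) / 100
h_grid <- default_h_grid(t)
```

A grid built this way could be passed explicitly through the h.seq argument when a different range or resolution is wanted.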
Details

See Craven and Wahba (1979) and Rice (1984).
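
For orientation, the GCV criterion in its usual textbook form can be written as follows. The symbols $\hat{m}_h$ and $S_h$ are introduced here for illustration only, and the expression implemented in the package may differ in detail.

$$\mathrm{GCV}(h) = \frac{n^{-1}\sum_{i=1}^{n}\{Y_i - \hat{m}_h(t_i)\}^2}{\{1 - n^{-1}\,\mathrm{tr}(S_h)\}^2},$$

where $\hat{m}_h$ is the kernel estimate of $m$ with bandwidth $h$, and $S_h$ is the $n \times n$ smoother matrix satisfying $(\hat{m}_h(t_1),\ldots,\hat{m}_h(t_n))^T = S_h\,(Y_1,\ldots,Y_n)^T$. The selected bandwidth is the value of $h$ in the considered sequence that minimizes $\mathrm{GCV}(h)$.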

Value

  • h.opt: selected value for the bandwidth.
  • GCV.opt: minimum value of the GCV function.
  • GCV: vector containing the values of the GCV function for each considered bandwidth.
  • h.seq: sequence of considered bandwidths in the GCV function.
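
To make the four returned components concrete, the following base-R sketch computes a GCV curve from scratch and returns a list with the same names. It is a minimal illustration under stated assumptions, not the package's implementation: epan, nw_smoother and gcv_sketch are hypothetical helpers, only the Nadaraya-Watson estimator with the Epanechnikov kernel is covered, and the criterion used is the textbook form (mean squared residual divided by the squared value of one minus the trace of the smoother matrix over n).

```r
# Epanechnikov ("quadratic") kernel
epan <- function(u) 0.75 * (1 - u^2) * (abs(u) <= 1)

# Smoother matrix of the Nadaraya-Watson estimator: row i holds the weights
# that map the responses to the fitted value at t[i].
nw_smoother <- function(t, h) {
  K <- epan(outer(t, t, "-") / h)
  K / rowSums(K)   # diagonal entries are 0.75, so row sums are never zero
}

# Hypothetical helper: textbook GCV evaluated over a bandwidth grid
gcv_sketch <- function(data, h.seq) {
  y <- data[, 1]
  t <- data[, 2]
  n <- length(y)
  GCV <- sapply(h.seq, function(h) {
    S <- nw_smoother(t, h)
    fitted <- drop(S %*% y)
    mean((y - fitted)^2) / (1 - sum(diag(S)) / n)^2
  })
  # Very small bandwidths can interpolate the data and give 0/0 = NaN;
  # which.min() skips such values and na.rm = TRUE does the same for min().
  list(h.opt   = h.seq[which.min(GCV)],
       GCV.opt = min(GCV, na.rm = TRUE),
       GCV     = GCV,
       h.seq   = h.seq)
}

# Simulated data in the style of Example 2a
set.seed(1234)
n <- 100
t <- ((1:n) - 0.5) / n
y <- 0.25 * t * (1 - t) + rnorm(n, 0, 0.01)
res <- gcv_sketch(cbind(y, t), h.seq = seq(0.02, 0.25, length.out = 40))
```

Here res$h.opt is the grid value at which res$GCV attains its minimum res$GCV.opt, mirroring the structure of the list described above. Values produced by this sketch need not match those of np.gcv.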

References

Craven, P. and Wahba, G. (1979) Smoothing noisy data with spline functions. Numer. Math. 31, 377-403.

Rice, J. (1984) Bandwidth choice for nonparametric regression. Ann. Statist. 12, 1215-1230.

See Also

Other related functions are: np.est, np.cv, plrm.est, plrm.gcv and plrm.cv.

Aliases
  • np.gcv
Examples
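# EXAMPLE 1: REAL DATA
library(PLRModels)                  # provides np.gcv and the barnacles1 data set

data <- matrix(10,120,2)            # placeholder matrix; both columns are overwritten below
data(barnacles1)
barnacles1 <- as.matrix(barnacles1)
data[,1] <- barnacles1[,1]          # response: first column of barnacles1
data <- diff(data, 12)              # lag-12 differences of the series
data[,2] <- 1:nrow(data)            # explanatory variable: time index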

aux <- np.gcv(data)
aux$h.opt
plot(aux$h.seq, aux$GCV, xlab="h", ylab="GCV", type="l")



# EXAMPLE 2: SIMULATED DATA
## Example 2a: independent data

set.seed(1234)
# We generate the data
n <- 100
t <- ((1:n)-0.5)/n
m <- function(t) {0.25*t*(1-t)}
f <- m(t)

epsilon <- rnorm(n, 0, 0.01)
y <-  f + epsilon
data_ind <- matrix(c(y,t),nrow=n)   # column 1: response y; column 2: design points t

# We apply the function
a <-np.gcv(data_ind)
a$GCV.opt

GCV <- a$GCV
h <- a$h.seq
plot(h, GCV, type="l")
Documentation reproduced from package PLRModels, version 1.1, License: GPL
