np.gcv
From PLRModels v1.1
by German Perez
Generalized cross-validation bandwidth selection in nonparametric regression models
From a sample $\{(Y_i, t_i): i=1,\ldots,n\}$, this routine computes an optimal bandwidth for estimating $m$ in the regression model $$Y_i= m(t_i) + \epsilon_i.$$ The regression function $m$ is smooth but unknown. The bandwidth is selected by generalized cross-validation (GCV), with $m$ estimated by kernel smoothing.
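For orientation, a common textbook form of the criterion is sketched here. Whether the package uses exactly this normalization is an assumption, and the symbols $\hat m_h$, $w_{h,j}$ and $S_h$ are introduced only for illustration. Write the kernel estimator with bandwidth $h$ as a linear smoother, $\hat m_h(t_i) = \sum_{j=1}^n w_{h,j}(t_i)\, Y_j$, and let $S_h$ be the $n \times n$ matrix with entries $w_{h,j}(t_i)$. The GCV function is then usually written as $$GCV(h) = \frac{n^{-1}\sum_{i=1}^n \left(Y_i - \hat m_h(t_i)\right)^2}{\left(1 - n^{-1}\,\mathrm{tr}(S_h)\right)^2},$$ and the selected bandwidth is the value of $h$, among those considered, that minimizes $GCV(h)$. For the Nadaraya-Watson estimator with kernel $K$, the weights are $$w_{h,j}(t) = \frac{K\left((t - t_j)/h\right)}{\sum_{k=1}^n K\left((t - t_k)/h\right)}.$$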
Usage
np.gcv(data = data, h.seq=NULL, num.h = 50, estimator = "NW",
kernel = "quadratic")
Arguments
- data: data[, 1] contains the values of the response variable, $Y$; data[, 2] contains the values of the explanatory variable, $t$.
- h.seq: sequence of considered bandwidths in the GCV function. If NULL (the default), num.h equidistant values between zero and a quarter of the range of $t_i$ are considered.
- num.h: number of values used to build the sequence of considered bandwidths. If h.seq is not NULL, num.h=length(h.seq). Otherwise, the default is 50.
- estimator: allows us the choice between NW (Nadaraya-Watson) or LLP (Local Linear Polynomial). The default is NW.
- kernel: allows us the choice between gaussian, quadratic (Epanechnikov kernel), triweight or uniform kernel. The default is quadratic.
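As a reminder of what the kernel names usually denote, the standard textbook definitions are given here. The scaling used internally by the package is not stated on this page, so treating these as the exact forms it uses is an assumption. $$K_{\mathrm{gaussian}}(u) = \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2}, \qquad K_{\mathrm{quadratic}}(u) = \frac{3}{4}\,(1-u^2)\,\mathbf{1}\{|u| \le 1\},$$ $$K_{\mathrm{triweight}}(u) = \frac{35}{32}\,(1-u^2)^3\,\mathbf{1}\{|u| \le 1\}, \qquad K_{\mathrm{uniform}}(u) = \frac{1}{2}\,\mathbf{1}\{|u| \le 1\}.$$ Each of these integrates to one. The last three have compact support, so with bandwidth $h$ only observations with $|t - t_j| \le h$ contribute to the estimate at $t$.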
Details
See Craven and Wahba (1979) and Rice (1984).
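To make the procedure concrete, the following base R sketch evaluates a GCV criterion by hand for the Nadaraya-Watson estimator with the Epanechnikov kernel. It is an illustration under stated assumptions, not the package's implementation: the helper names (epan, nw_smoother, gcv_nw), the textbook form of the criterion, and the bandwidth grid are all hypothetical choices made for this example.

```r
# Epanechnikov ("quadratic") kernel with support [-1, 1] (standard textbook form)
epan <- function(u) ifelse(abs(u) <= 1, 0.75 * (1 - u^2), 0)

# Nadaraya-Watson smoother matrix S_h: row i holds the weights that
# map the responses y to the fitted value at t[i]
nw_smoother <- function(t, h, kernel = epan) {
  W <- kernel(outer(t, t, "-") / h)
  W / rowSums(W)
}

# Assumed criterion: mean squared residual divided by (1 - tr(S_h)/n)^2
gcv_nw <- function(y, t, h, kernel = epan) {
  S <- nw_smoother(t, h, kernel)
  res <- y - as.vector(S %*% y)
  mean(res^2) / (1 - sum(diag(S)) / length(y))^2
}

# Simulated data in the spirit of Example 2a
set.seed(1234)
n <- 100
t <- ((1:n) - 0.5) / n
y <- 0.25 * t * (1 - t) + rnorm(n, 0, 0.01)

# Hypothetical grid: from twice the largest gap between design points
# (so every window contains neighbours) up to a quarter of the range of t
h_min <- 2 * max(diff(sort(t)))
h_max <- diff(range(t)) / 4
h_grid <- seq(h_min, h_max, length.out = 50)

# Evaluate the criterion on the grid and pick the minimizer
gcv_values <- sapply(h_grid, function(h) gcv_nw(y, t, h))
h_opt <- h_grid[which.min(gcv_values)]
```

The lower end of the grid matters with compactly supported kernels: if h is smaller than the spacing of the design points, each window contains only its own observation, the smoother matrix becomes the identity, and the criterion reduces to 0/0.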
Value
- h.opt: selected value for the bandwidth.
- GCV.opt: minimum value of the GCV function.
- GCV: vector containing the values of the GCV function for each considered bandwidth.
- h.seq: sequence of considered bandwidths in the GCV function.
References
Craven, P. and Wahba, G. (1979) Smoothing noisy data with spline functions. Numer. Math. 31, 377-403.
Rice, J. (1984) Bandwidth choice for nonparametric regression. Ann. Statist. 12, 1215-1230.
See Also
Other related functions are: np.est, np.cv, plrm.est, plrm.gcv and plrm.cv.
Examples
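# EXAMPLE 1: REAL DATA
library(PLRModels)
# Two-column matrix: response in column 1, explanatory variable in column 2
data <- matrix(10,120,2)
data(barnacles1)
barnacles1 <- as.matrix(barnacles1)
data[,1] <- barnacles1[,1]
# Difference the response at lag 12, then use the time index as explanatory variable
data <- diff(data, 12)
data[,2] <- 1:nrow(data)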
aux <- np.gcv(data)
aux$h.opt
plot(aux$h.seq, aux$GCV, xlab="h", ylab="GCV", type="l")
# EXAMPLE 2: SIMULATED DATA
## Example 2a: independent data
set.seed(1234)
# We generate the data
n <- 100
t <- ((1:n)-0.5)/n
m <- function(t) {0.25*t*(1-t)}
f <- m(t)
epsilon <- rnorm(n, 0, 0.01)
y <- f + epsilon
data_ind <- matrix(c(y,t),nrow=n)
# We apply the function
a <- np.gcv(data_ind)
a$GCV.opt
GCV <- a$GCV
h <- a$h.seq
plot(h, GCV, type="l")