SCBmeanfd (version 1.1)

cv.select: Cross-Validation Bandwidth Selection for Local Polynomial Estimation

Description

Select the cross-validation bandwidth described in Rice and Silverman (1991) for the local polynomial estimation of a mean function based on functional data.

Usage

cv.select(x, y, degree, interval = NULL, ...)

Arguments

x
numeric vector of x data. x must be a uniform grid; missing values are not accepted.
y
matrix or data frame with functional observations (= curves) stored in rows. The number of columns of y must match the length of x. Missing values are not accepted.
degree
degree of local polynomial used.
interval
numeric vector of length 2; the lower and upper bounds of the search interval.
...
additional arguments to pass to the optimization function optimize.
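The requirements on x and y listed above can be checked before calling cv.select. The following is a minimal sketch in base R; the helper name `check_cv_inputs` and the tolerance `tol` are hypothetical and not part of SCBmeanfd, and the package may perform its own, different checks.

```r
# Hypothetical helper (not part of SCBmeanfd): checks the documented
# requirements on x and y before bandwidth selection.
check_cv_inputs <- function(x, y, tol = 1e-8) {
  y <- as.matrix(y)
  stopifnot(
    is.numeric(x), length(x) >= 2,
    !anyNA(x), !anyNA(y),          # missing values are not accepted
    ncol(y) == length(x)           # one column of y per grid point
  )
  steps <- diff(x)
  # uniform grid: all consecutive spacings agree up to a relative tolerance
  if (any(abs(steps - steps[1]) > tol * max(1, abs(steps[1])))) {
    stop("x is not a uniform grid")
  }
  invisible(TRUE)
}

ok <- check_cv_inputs(8:21, matrix(0, nrow = 3, ncol = 14))
```

The check on `diff(x)` uses a tolerance rather than exact equality because grids built with `seq()` can differ in the last floating-point digits.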

Value

a bandwidth that minimizes the cross-validation score.

Details

The cross-validation score is obtained by leaving out one entire curve at a time and computing the prediction error of the local polynomial smoother based on all other curves. For a bandwidth value $h$, this score is $$ S(h) = \sum_{i=1}^n \sum_{j=1}^p \left( Y_{ij} - \hat{\mu}^{-(i)}(t_j;h) \right)^2, $$ where $Y_{ij}$ is the measurement of the $i$-th curve at time $t_j$, and $\hat{\mu}^{-(i)}(t_j;h)$ is the local polynomial estimator with bandwidth $h$ based on all curves except the $i$-th.
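The leave-one-curve-out score can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the function names `locpoly_fit` and `cv_score_sketch` are hypothetical, a Gaussian kernel is assumed, and the data are simulated. Because all curves share the same grid, a weighted least squares fit to the pooled curves equals the fit to their pointwise average, so each leave-one-out estimate smooths the mean of the remaining curves.

```r
# Hypothetical helper: local polynomial fit of values `ybar` on grid `x`,
# evaluated at the grid points, with a Gaussian kernel (an assumption).
locpoly_fit <- function(x, ybar, h, degree = 1) {
  sapply(x, function(x0) {
    w <- dnorm((x - x0) / h)               # kernel weights around x0
    X <- outer(x - x0, 0:degree, `^`)      # local design matrix
    beta <- solve(crossprod(X, w * X), crossprod(X, w * ybar))
    beta[1]                                # intercept = fitted value at x0
  })
}

# Hypothetical sketch of S(h): leave out curve i, smooth the mean of the
# other curves, and accumulate the squared prediction error for curve i.
cv_score_sketch <- function(h, x, y, degree = 1) {
  y <- as.matrix(y)
  n <- nrow(y)
  total <- colSums(y)
  score <- 0
  for (i in seq_len(n)) {
    mean_minus_i <- (total - y[i, ]) / (n - 1)
    mu_hat <- locpoly_fit(x, mean_minus_i, h, degree)
    score <- score + sum((y[i, ] - mu_hat)^2)
  }
  score
}

# Simulated curves (made-up data): 10 noisy sine curves on a uniform grid
set.seed(1)
x <- seq(0, 1, length.out = 20)
y <- t(replicate(10, sin(2 * pi * x) + rnorm(length(x), sd = 0.3)))
s <- cv_score_sketch(0.1, x, y, degree = 1)
```

Very small bandwidths can make the local design matrix numerically singular in this sketch, which is one reason a lower bound on the search interval is needed.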

cv.select uses the standard R function optimize to minimize cv.score. If the argument interval is not specified, the lower bound of the search interval defaults to $(x[2] - x[1])/2$ if $degree < 2$ and to $x[2] - x[1]$ if $degree >= 2$. The default upper bound is $(max(x) - min(x))/2$. These values guarantee in most cases that the local polynomial estimator is well defined. Because optimize may return a local rather than a global minimum, it is often useful to evaluate or plot the score over a grid of bandwidth values (grid search) before applying the numerical optimizer. The search interval can then be narrowed down, and the optimizer is more likely to find the global minimum.
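The default interval and the grid-search-then-optimize strategy can be sketched as follows. The helper names `default_interval` and `grid_then_optimize` are hypothetical, and `toy_score` is a made-up function with two local minima standing in for the cross-validation score; in practice it would be replaced by a function of the bandwidth such as a wrapper around cv.score.

```r
# Hypothetical helper: default search interval as described in the text
default_interval <- function(x, degree) {
  step <- x[2] - x[1]
  lower <- if (degree < 2) step / 2 else step
  upper <- (max(x) - min(x)) / 2
  c(lower, upper)
}

# Hypothetical helper: coarse grid search, then optimize() on the
# sub-interval bracketing the best grid point
grid_then_optimize <- function(f, interval, n_grid = 50) {
  h_grid <- seq(interval[1], interval[2], length.out = n_grid)
  scores <- vapply(h_grid, f, numeric(1))
  k <- which.min(scores)
  lo <- h_grid[max(k - 1, 1)]
  hi <- h_grid[min(k + 1, n_grid)]
  optimize(f, interval = c(lo, hi))
}

# Made-up objective with a global minimum near h = 1 and a local one near h = 4
toy_score <- function(h) (h - 1)^2 * (h - 4)^2 + 0.1 * h

interval <- default_interval(8:21, degree = 1)
res <- grid_then_optimize(toy_score, interval)
res$minimum    # bandwidth found inside the narrowed interval
res$objective  # score at that bandwidth
```

Plotting `scores` against `h_grid` before optimizing shows whether the score has several local minima and which sub-interval contains the lowest one.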

References

Rice, J. A. and Silverman, B. W. (1991). Estimating the mean and covariance structure nonparametrically when the data are curves. Journal of the Royal Statistical Society. Series B (Methodological), 53, 233--243.

See Also

cv.score, plugin.select

Examples

## Plasma citrate data
## Compare cross-validation scores and bandwidths
## for local linear and local quadratic smoothing

library(SCBmeanfd)
data(plasma)
time <- 8:21

## Local linear smoothing
cv.select(time, plasma, 1)                        # local solution h = 3.76, S(h) = 463.08
cv.select(time, plasma, 1, interval = c(.5, 1))   # global solution h = .75, S(h) = 439.54

## Local quadratic smoothing
cv.select(time, plasma, 2)                        # global solution h = 1.15, S(h) = 432.75
cv.select(time, plasma, 2, interval = c(1, 1.5))  # same