hcv: Cross-validatory choice of smoothing parameter

Description

This function uses the technique of cross-validation to select a smoothing parameter suitable for constructing a density estimate or nonparametric regression curve in one or two dimensions.

Usage

hcv(x, y = NA, hstart = NA, hend = NA, ...)

Arguments

x

a vector, or two-column matrix, of data. If y is missing, these are the observations used to construct a density estimate. If y is present, these are the covariate values for a nonparametric regression.

y

a vector of response values for nonparametric regression.

hstart

the smallest value of the grid points to be used in an initial grid search for the value of the smoothing parameter.

hend

the largest value of the grid points to be used in an initial grid search for the value of the smoothing parameter.

...

other optional parameters are passed to the sm.options function, through a mechanism which limits their effect to this call of the function only. Those specifically relevant for this function are h.weights, ngrid, display and add; see the documentation of sm.options for their description.

Value

the value of the smoothing parameter which minimises the cross-validation criterion over the selected grid.

Side Effects

If the minimising value is located at the end of the grid of search positions, or if some values of the cross-validatory criterion cannot be evaluated, then a warning message is printed. In these circumstances altering the values of hstart and hend may improve performance.
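The adjustment suggested above can be sketched in base R. This is only an illustration of the idea, not the code used inside hcv: check_grid and toy_cv are hypothetical names, the toy criterion stands in for the real cross-validation function, and the factor of 10 used to widen the range is an arbitrary choice.

```r
# Hypothetical helper: report whether the grid minimum sits on an edge
# and how many criterion values could not be evaluated.
check_grid <- function(cv) {
  ok <- is.finite(cv)
  i  <- which.min(replace(cv, !ok, Inf))
  list(at_edge = i == 1 || i == length(cv), n_failed = sum(!ok))
}

# Toy criterion with its minimum at h = 1 (a stand-in for the CV function).
toy_cv <- function(h) log(h)^2

hstart <- 2
hend   <- 8                       # this range misses the minimum
grid <- exp(seq(log(hstart), log(hend), length.out = 8))
res  <- check_grid(toy_cv(grid))

if (res$at_edge) {
  hstart <- hstart / 10           # widen the range downwards and search again
  grid <- exp(seq(log(hstart), log(hend), length.out = 8))
  res  <- check_grid(toy_cv(grid))
}
```

With hcv itself, the equivalent step is to call the function again with different hstart and hend values.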

Details

See Sections 2.4 and 4.5 of the reference below.

The two-dimensional case uses a smoothing parameter derived from a single value, scaled by the standard deviation of each component.
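One reading of this scaling is sketched below in base R. It assumes that the single value is simply multiplied by the sample standard deviation of each column; the internals of the package may differ in detail, and the value 0.4 is purely illustrative.

```r
set.seed(1)
# Two components on very different scales
x <- cbind(rnorm(100, sd = 1), rnorm(100, sd = 10))

h   <- 0.4                        # illustrative single value
h2d <- h * apply(x, 2, sd)        # one smoothing parameter per component
```

Under this assumption, the search remains one-dimensional while each component still receives a smoothing parameter in proportion to its spread.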

This function does not employ a sophisticated algorithm and some adjustment of the search parameters may be required for different sets of data. An initial estimate of the value of h which minimises the cross-validatory criterion is located from a grid search using values which are equally spaced on a log scale between hstart and hend. A quadratic approximation is then used to refine this initial estimate.
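The two-stage search described above might look roughly as follows in base R. This is a sketch under stated assumptions rather than the package's own code: lscv and grid_then_quadratic are hypothetical names, the criterion is assumed to be least-squares cross-validation for a normal-kernel density estimate, and the quadratic is assumed to be fitted on the log scale through the grid minimum and its two neighbours.

```r
# Least-squares cross-validation score for a normal-kernel density
# estimate (an assumed stand-in for the criterion used by hcv).
lscv <- function(h, x) {
  n <- length(x)
  d <- outer(x, x, "-")
  int_f2 <- sum(dnorm(d, sd = sqrt(2) * h)) / n^2            # integral of fhat^2
  loo <- (rowSums(dnorm(d, sd = h)) - dnorm(0, sd = h)) / (n - 1)  # leave-one-out fits
  int_f2 - 2 * mean(loo)
}

grid_then_quadratic <- function(crit, hstart, hend, ngrid = 8) {
  # Stage 1: grid equally spaced on the log scale
  logh <- seq(log(hstart), log(hend), length.out = ngrid)
  cv   <- vapply(exp(logh), crit, numeric(1))
  i    <- which.min(cv)
  if (i == 1 || i == ngrid) {
    warning("minimum at the end of the grid; consider changing hstart or hend")
    return(exp(logh[i]))
  }
  # Stage 2: vertex of the parabola through the three points around the minimum
  step <- logh[2] - logh[1]
  curv <- cv[i - 1] - 2 * cv[i] + cv[i + 1]
  if (!is.finite(curv) || curv <= 0) return(exp(logh[i]))
  exp(logh[i] + 0.5 * step * (cv[i - 1] - cv[i + 1]) / curv)
}

set.seed(1)
x <- rnorm(50)
h_hat <- grid_then_quadratic(function(h) lscv(h, x), 0.05, 2, ngrid = 16)
```

Because the refinement uses only three grid points, its accuracy depends on how fine the grid is near the minimum, which is one reason a larger ngrid or a narrower range between hstart and hend can help.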

References

Bowman, A.W. and Azzalini, A. (1997). Applied Smoothing Techniques for Data Analysis: the Kernel Approach with S-Plus Illustrations. Oxford University Press, Oxford.

Examples


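#  Density estimation

library(sm)
x <- rnorm(50)
par(mfrow=c(1,2))
# Left panel: cross-validation criterion over a grid of 32 values
h.cv <- hcv(x, display="lines", ngrid=32)
# Right panel: density estimate using the selected smoothing parameter
sm.density(x, h=h.cv)
par(mfrow=c(1,1))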

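#  Nonparametric regression

x <- seq(0, 1, length = 50)
y <- rnorm(50, sin(2 * pi * x), 0.2)
par(mfrow=c(1,2))
# Left panel: cross-validation criterion over a grid of 32 values
h.cv <- hcv(x, y, display="lines", ngrid=32)
# Right panel: regression curve using the selected smoothing parameter
sm.regression(x, y, h=h.cv)
par(mfrow=c(1,1))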
