Learn R Programming

sm (version 2.0-2)

hcv: Cross-validatory choice of smoothing parameter

Description

This function uses the technique of cross-validation to select a smoothing parameter suitable for constructing a density estimate or nonparametric regression curve in one or two dimensions.

Usage

hcv(x, y = NA, hstart = NA, hend = NA, ...)

Arguments

x
a vector, or two-column matrix of data. If y is missing these are observations to be used in the construction of a density estimate. If y is present, these are the covariate values for a nonparametric regression.
y
a vector of response values for nonparametric regression.
hstart
the smallest value of the grid points to be used in an initial grid search for the value of the smoothing parameter.
hend
the largest value of the grid points to be used in an initial grid search for the value of the smoothing parameter.
...
other optional parameters are passed to the sm.options function, through a mechanism which limits their effect only to this call of the function; those relevant for this function are the following:
h.weights
a vector of weights which multiply the smoothing parameter(s) used in the kernel function at each observation.
ngrid
the number of grid points to be used in an initial grid search for the value of the smoothing parameter. Default: ngrid=8.
display
any character setting other than "none" will cause the criterion function to be plotted over the search grid of smoothing parameters. The particular value "log" will use a log scale for the grid values.
add
controls whether the plot is added to an existing graph. Default: add=F.

Value

  • the value of the smoothing parameter which minimises the cross-validation criterion over the selected grid.

Side Effects

If the minimising value is located at the end of the grid of search positions, or if some values of the cross-validatory criterion cannot be evaluated, then a warning message is printed. In these circumstances altering the values of hstart and hend may improve performance.

Details

See Sections 2.4 and 4.5 of the reference below.

The two-dimensional case uses a smoothing parameter derived from a single value, scaled by the standard deviation of each component.

This function does not employ a sophisticated algorithm and some adjustment of the search parameters may be required for different sets of data. An initial estimate of the value of h which minimises the cross-validatory criterion is located from a grid search using values which are equally spaced on a log scale between hstart and hend. A quadratic approximation is then used to refine this initial estimate.

References

Bowman, A.W. and Azzalini, A. (1997). Applied Smoothing Techniques for Data Analysis: the Kernel Approach with S-Plus Illustrations. Oxford University Press, Oxford.

See Also

cv, hsj, hnorm

Examples

Run this code
#  Density estimation

x <- rnorm(50)
par(mfrow=c(1,2))
h.cv <- hcv(x, display="lines", ngrid=32)
sm.density(x, h=hcv(x))
par(mfrow=c(1,1))

#  Nonparametric regression

x <- seq(0, 1, length = 50)
y <- rnorm(50, sin(2 * pi * x), 0.2)
par(mfrow=c(1,2))
h.cv <- hcv(x, y, display="lines", ngrid=32)
sm.regression(x, y, h=hcv(x, y))
par(mfrow=c(1,1))

Run the code above in your browser using DataLab