npsp (version 0.3-6)

h.cv: Cross-validation methods for bandwidth selection

Description

Selects the bandwidth of a local polynomial kernel (regression, density or variogram) estimator using (standard or modified) CV, GCV or MASE criteria.

Usage

h.cv(bin, ...)

## S3 method for class 'bin.data'
h.cv(bin, objective = c("CV", "GCV", "MASE"), h.start = NULL, h.lower = NULL, h.upper = NULL, degree = 1, ncv = ifelse(objective == "GCV", 0, 1), cov.bin = NULL, DEalgorithm = FALSE, warn = FALSE, ...)

## S3 method for class 'bin.den'
h.cv(bin, h.start = NULL, h.lower = NULL, h.upper = NULL, degree = 1, ncv = 1, DEalgorithm = FALSE, warn = FALSE, ...)

hcv.data(bin, objective = c("CV", "GCV", "MASE"), h.start = NULL, h.lower = NULL, h.upper = NULL, degree = 1, ncv = ifelse(objective == "GCV", 0, 1), cov = NULL, DEalgorithm = FALSE, warn = FALSE, ...)

Arguments

bin
object used to select a method (binned data, binned density or binned semivariogram).
...
further arguments passed to or from other methods (e.g. parameters of the optimization routine).
objective
character; optimality criterion to be used ("CV", "GCV" or "MASE").
h.start
vector; initial values for the parameters (diagonal elements) to be optimized over. Only used if DEalgorithm == FALSE, in which case it defaults to (3 + ncv) * lag, where lag = bin$grid$lag.
h.lower
vector; lower bounds on each parameter (diagonal elements) to be optimized. Defaults to (1.5 + ncv) * bin$grid$lag.
h.upper
vector; upper bounds on each parameter (diagonal elements) to be optimized. Defaults to 1.5 * dim(bin) * bin$grid$lag.
DEalgorithm
logical; if TRUE, the differential evolution optimization algorithm in package DEoptim is used.
ncv
integer; determines the number of cells left out in each dimension (0 for GCV, which uses all the data; > 0 for traditional or modified cross-validation). See "Details" below.
cov.bin
covariance matrix of the binned data. Defaults to identity.
warn
logical; sets the handling of warning messages (normally due to the lack of data in some neighborhoods). If FALSE (the default) all warnings are ignored.
cov
covariance matrix of the data. Defaults to identity (uncorrelated data).
degree
degree of the local polynomial used. Defaults to 1 (local linear estimation).
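
The default values of h.start, h.lower and h.upper described above can be reproduced by hand. The following is a minimal sketch in base R, not code from the package: the vectors lag and nbin are hypothetical stand-ins for bin$grid$lag and dim(bin), and their values are made up for illustration.

```r
# Hypothetical two-dimensional binning grid (values chosen for illustration only)
lag  <- c(0.5, 0.25)  # stands in for bin$grid$lag (cell size in each dimension)
nbin <- c(30, 30)     # stands in for dim(bin) (number of cells in each dimension)
ncv  <- 2             # modified CV

# Defaults as stated in the argument descriptions
h.start <- (3 + ncv) * lag    # 2.50 1.25
h.lower <- (1.5 + ncv) * lag  # 1.750 0.875
h.upper <- 1.5 * nbin * lag   # 22.50 11.25
```

With these formulas the starting values always lie above the lower bounds, and a larger ncv raises both of them.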

Value

Returns a list containing the following 3 components:
h
the best (diagonal) bandwidth matrix found.
value
the value of the objective function corresponding to h.
objective
the criterion used.

Details

Currently, only diagonal windows are supported.

h.cv methods use binning approximations to the objective function values. If ncv > 0, estimates are computed by leaving out the binning cells with indexes within the intervals $[x_i - ncv + 1, x_i + ncv - 1]$ in each dimension $i$, where $x$ denotes the index of the estimation position. $ncv = 1$ corresponds to traditional cross-validation and $ncv > 1$ to modified CV (see e.g. Chu and Marron, 1991, for the one-dimensional case). For standard GCV, set ncv = 0 (the full data is used). For theoretical MASE, set y = trend.teor, cov = cov.teor and ncv = 0.
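
The leave-out interval can be illustrated in one dimension with a few lines of base R. This is only a sketch of the index arithmetic, not the package's internal code: the helper left_out() is hypothetical, and clipping the interval to the grid range 1..n is an assumption made here so that the result is a valid set of cell indexes.

```r
# Hypothetical helper: indexes of the binning cells left out around the
# estimation position x, on a one-dimensional grid with n cells.
left_out <- function(x, ncv, n) {
  if (ncv == 0) return(integer(0))       # ncv = 0: no cell is left out
  idx <- seq(x - ncv + 1, x + ncv - 1)   # interval [x - ncv + 1, x + ncv - 1]
  idx[idx >= 1 & idx <= n]               # assumption: drop indexes outside the grid
}

left_out(5, ncv = 1, n = 10)  # 5       (traditional CV: only the cell itself)
left_out(5, ncv = 2, n = 10)  # 4 5 6   (modified CV: the cell and its neighbours)
left_out(1, ncv = 3, n = 10)  # 1 2 3   (interval clipped at the grid border)
```

In several dimensions the same interval applies to each index separately, so the cells left out form a block of up to 2 * ncv - 1 cells per dimension.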

If DEalgorithm == FALSE, the "L-BFGS-B" method in optim is used.
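
As a rough illustration of this kind of box-constrained search, the sketch below calls optim with method = "L-BFGS-B" on a toy objective. The function toy_criterion() is a made-up quadratic, not the CV, GCV or MASE criterion computed by npsp, and the starting values and bounds are arbitrary numbers chosen for the example.

```r
# Toy objective with its minimum at h = c(2, 1); it only stands in for a
# bandwidth selection criterion and is NOT the one used by npsp.
toy_criterion <- function(h) sum((h - c(2, 1))^2)

res <- optim(par = c(2.5, 1.25),          # plays the role of h.start
             fn = toy_criterion,
             method = "L-BFGS-B",
             lower = c(1.75, 0.875),      # plays the role of h.lower
             upper = c(22.5, 11.25))      # plays the role of h.upper

res$par    # diagonal elements of the selected bandwidth
res$value  # objective value at res$par
```

Because "L-BFGS-B" is a local method, the result can depend on the starting values; setting DEalgorithm = TRUE switches to the differential evolution search mentioned in the Arguments section.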

hcv.data evaluates the objective functions at the original data (combining a binning approximation to the nonparametric estimates with a linear interpolation). If ncv > 1 (modified CV), an algorithm similar to that in h.cv.bin.data is used: estimates are computed by leaving out the binning cells with indexes within the intervals $[x_i - ncv + 1, x_i + ncv - 1]$.

References

Chu, C.K. and Marron, J.S. (1991) Comparison of Two Bandwidth Selectors with Dependent Errors. The Annals of Statistics, 19, 1906-1918.

Francisco-Fernandez M. and Opsomer J.D. (2005) Smoothing parameter selection methods for nonparametric regression with spatially correlated errors. Canadian Journal of Statistics, 33, 539-558.

See Also

locpol, locpolhcv, binning, np.svar.

Examples

library(npsp)  # provides binning, h.cv, locpol and the earthquakes data set
bin <- binning(earthquakes[, c("lon", "lat")], earthquakes$mag, nbin = c(30,30))
hcv <- h.cv(bin, ncv = 2)
lp <- locpol(bin, h = hcv$h)
## Alternatively:
## lp <- locpolhcv(earthquakes[, c("lon", "lat")], earthquakes$mag, nbin = c(30,30), ncv = 2)

simage(lp, main = 'Smoothed magnitude')
contour(lp, add = TRUE)
with(earthquakes, points(lon, lat, pch = 20))

## Density estimation
hden <- h.cv(as.bin.den(bin))
den <- np.den(bin, h = hden$h)

plot(den, main = 'Estimated log(density)')
