kedd (version 1.0.3)

h.ccv: Complete Cross-Validation for Bandwidth Selection

Description

The (S3) generic function h.ccv computes the complete cross-validation bandwidth selector for the r'th derivative of a one-dimensional kernel density estimator.

Usage

h.ccv(x, ...)

## Default S3 method:
h.ccv(x, deriv.order = 0, lower = 0.1 * hos, upper = hos, tol = 0.1 * lower,
      kernel = c("gaussian", "triweight", "tricube", "biweight", "cosine"), ...)

Arguments

x
vector of data values.
deriv.order
derivative order (scalar).
lower, upper
range over which to minimize. The default is almost always satisfactory. hos (the over-smoothing bandwidth) is computed internally from the kernel; see Details.
tol
the convergence tolerance for optimize.
kernel
a character string giving the smoothing kernel to be used, with default "gaussian".
...
further arguments for (non-default) methods.
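
As an illustration of the arguments above, a call with a non-default kernel and an explicit search range might look like the following; the data vector is an arbitrary stand-in, not from the package:

set.seed(1)
x <- rnorm(100)
h.ccv(x, deriv.order = 0, kernel = "biweight", lower = 0.1, upper = 1)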

Value

x
data points - same as input.
data.name
the deparsed name of the x argument.
n
the sample size after elimination of missing values.
kernel
name of the kernel used.
deriv.order
the derivative order to use.
h
value of the bandwidth parameter.
min.ccv
the minimal CCV value.
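
For instance, the selected bandwidth and the criterion value at the minimum can be read off the returned list; the data vector here is a hypothetical stand-in:

fit <- h.ccv(rnorm(100), deriv.order = 0)
fit$h        # selected bandwidth
fit$min.ccv  # CCV value at the minimum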

Details

h.ccv implements complete cross-validation for choosing the bandwidth $h$ of a kernel estimator of the r'th derivative of a density. Jones and Kappenman (1991) proposed the so-called complete cross-validation (CCV) method for kernel density estimation. The method extends to the estimation of density derivatives: basing the estimate of the integrated squared density derivative (Hall and Marron, 1987) on the $\bar{\theta}_{r}(h)$'s, and starting from $R\left(\hat{f}_{h}^{(r)}\right) - \bar{\theta}_{r}(h)$ as an estimate of the MISE, the selector $h^{(r)}_{CCV}$ is the $h$ that minimises:
$$CCV(h;r)=R\left(\hat{f}_{h}^{(r)}\right)-\bar{\theta}_{r}(h)+\frac{1}{2}\mu_{2}(K) h^{2} \bar{\theta}_{r+1}(h)+\frac{1}{24}\left(6\mu_{2}^{2}(K) -\delta(K)\right)h^{4}\bar{\theta}_{r+2}(h)$$
with
$$R\left(\hat{f}_{h}^{(r)}\right) = \int \left(\hat{f}_{h}^{(r)}(x)\right)^{2} dx = \frac{R\left(K^{(r)}\right)}{nh^{2r+1}} + \frac{(-1)^{r}}{n (n-1) h^{2r+1}} \sum_{i=1}^{n}\sum_{j=1;j \neq i}^{n} K^{(r)} \ast K^{(r)} \left(\frac{X_{j}-X_{i}}{h}\right)$$
and
$$\bar{\theta}_{r}(h)= \frac{(-1)^r}{n(n-1) h^{2r+1}} \sum_{i=1}^{n} \sum_{j=1;j \neq i}^{n} K^{(2r)} \left(\frac{X_{j}-X_{i}}{h}\right),$$
where $K^{(r)} \ast K^{(r)}$ is the convolution of the r'th derivative of the kernel function $K$ with itself (see kernel.conv and kernel.fun), $R\left(K^{(r)}\right) = \int \left(K^{(r)}(x)\right)^{2} dx$, $\mu_{2}(K) = \int x^{2} K(x) dx$ and $\delta(K) = \int x^{4} K(x) dx$. The range over which to minimize is the over-smoothing bandwidth hos, computed internally from the kernel; the default is almost always satisfactory. See Terrell and Scott (1985), Terrell (1990), Scott (1992, pp 165) and Wand and Jones (1995, pp 61).
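
As a sketch of the criterion itself, the CCV objective for r = 0 with the Gaussian kernel can be coded directly from the formulas above. This is an illustration under stated assumptions, not the package's internal implementation; the function name ccv.gauss and the search range are ad hoc. For the Gaussian kernel, $\mu_{2}(K) = 1$, $\delta(K) = 3$, $R(K) = 1/(2\sqrt{\pi})$, and $K \ast K$ is the N(0, 2) density, so the last term reduces to $h^{4}\bar{\theta}_{2}(h)/8$.

## CCV criterion for deriv.order = 0, Gaussian kernel (illustrative sketch)
ccv.gauss <- function(h, x) {
  n <- length(x)
  u <- outer(x, x, "-") / h           # pairwise (X_j - X_i)/h
  diag(u) <- NA                       # drop the i = j terms
  phi  <- dnorm(u)                    # K(u)
  phi2 <- (u^2 - 1) * phi             # K''(u)
  phi4 <- (u^4 - 6 * u^2 + 3) * phi   # K''''(u)
  S <- function(m) sum(m, na.rm = TRUE) / (n * (n - 1))
  ## R(hat f_h): R(K)/(n h) plus the double sum over K*K, the N(0, 2) density
  Rf  <- 1 / (2 * sqrt(pi) * n * h) + S(dnorm(u, sd = sqrt(2))) / h
  th0 <- S(phi) / h                   # bar(theta)_0(h)
  th1 <- -S(phi2) / h^3               # bar(theta)_1(h)
  th2 <- S(phi4) / h^5                # bar(theta)_2(h)
  Rf - th0 + h^2 / 2 * th1 + h^4 / 8 * th2
}

set.seed(1)
x <- rnorm(50)
optimize(ccv.gauss, c(0.05, 2), x = x)$minimum

The minimiser should agree with h.ccv(x)$h on the same data, up to the choice of search range.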

References

Jones, M. C. and Kappenman, R. F. (1991). On a class of kernel density estimate bandwidth selectors. Scandinavian Journal of Statistics, 19, 337--349.

Hall, P. and Marron, J.S. (1987). Estimation of integrated squared density derivatives. Statistics and Probability Letters, 6, 109--115.

See Also

plot.h.ccv.

Examples

## Derivative order = 0

h.ccv(kurtotic, deriv.order = 0)

## Derivative order = 1

h.ccv(kurtotic, deriv.order = 1)
