--For kernel density estimation, the main function is kde which computes
$$\hat{f}(\bold{x}) = n^{-1} \sum_{i=1}^n K_{\bold{{\rm H}}} (\bold{x} - \bold{X}_i).$$
The bandwidth matrix $\bold{{\rm H}}$ is a matrix of smoothing parameters, and its choice is crucial for the performance of kernel estimators. For display, its plot method calls plot.kde.
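Spelled out for a Gaussian kernel, the estimator above can be sketched in a few lines. This is an illustrative hand-rolled version, not the ks implementation; the function name kde_at and the direct sum over all n points are assumptions of this sketch:

```python
import numpy as np

def kde_at(x, X, H):
    """Evaluate f_hat(x) = n^-1 sum_i K_H(x - X_i) at a single point x,
    where K_H is the Gaussian kernel with bandwidth (covariance) matrix H."""
    n, d = X.shape
    Hinv = np.linalg.inv(H)
    norm = 1.0 / np.sqrt((2 * np.pi) ** d * np.linalg.det(H))
    diffs = x - X                                    # rows are x - X_i
    quad = np.einsum('ij,jk,ik->i', diffs, Hinv, diffs)
    return norm * np.mean(np.exp(-0.5 * quad))
```

With all data at the origin and $\bold{{\rm H}}$ the 2-d identity, the estimate at $\bold{0}$ reduces to the kernel's peak value $1/(2\pi)$.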
--For kernel density estimators, there are several varieties of bandwidth selectors:
plug-in: hpi (1-d); Hpi, Hpi.diag (2- to 6-d);
least-squares cross-validation: hlscv (1-d); Hlscv, Hlscv.diag (2- to 6-d);
biased cross-validation: Hbcv, Hbcv.diag (2- to 6-d);
smoothed cross-validation: hscv (1-d); Hscv, Hscv.diag (2- to 6-d);
normal scale: hns (1-d); Hns (2- to 6-d).
--For kernel density derivative estimation, the main function is kdde which computes
$${\sf D}^{\otimes r}\hat{f}(\bold{x}) = n^{-1} \sum_{i=1}^n {\sf
D}^{\otimes r}K_{\bold{{\rm H}}} (\bold{x} -
\bold{X}_i).$$
The bandwidth selectors are a modified subset of those for kde, i.e. Hlscv, Hns, Hpi, Hscv with deriv.order>0.
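For deriv.order=1 (the density gradient), the Gaussian kernel has ${\sf D}K_{\bold{{\rm H}}}(\bold{u}) = -K_{\bold{{\rm H}}}(\bold{u})\,\bold{{\rm H}}^{-1}\bold{u}$, so the sum above can be sketched as follows. This is illustrative only, not the ks implementation; the function name kdde_grad is an assumption of this sketch:

```python
import numpy as np

def kdde_grad(x, X, H):
    """Density gradient estimate (deriv.order=1):
    D f_hat(x) = n^-1 sum_i D K_H(x - X_i),
    with D K_H(u) = -K_H(u) H^{-1} u for the Gaussian kernel."""
    n, d = X.shape
    Hinv = np.linalg.inv(H)
    norm = 1.0 / np.sqrt((2 * np.pi) ** d * np.linalg.det(H))
    diffs = x - X                                    # rows are u_i = x - X_i
    quad = np.einsum('ij,jk,ik->i', diffs, Hinv, diffs)
    k = norm * np.exp(-0.5 * quad)                   # K_H(x - X_i)
    return -(k[:, None] * (diffs @ Hinv)).mean(axis=0)
```

At a point midway between two symmetric data points the estimated gradient vanishes, as expected for a symmetric density estimate.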
Its plot method is plot.kdde, for plotting each partial derivative singly.
--For kernel discriminant analysis, the main function is
kda
which computes density estimates for each of the
groups in the training data, and the discriminant surface.
Its plot
method is plot.kda
. The wrapper functions hkda, Hkda compute bandwidths for each group in the training data for kde, e.g. hpi, Hpi.
--For kernel functional estimation, the main function is
kfe
which computes the $r$-th order integrated density functional
$$\hat{{\bold \psi}}_r = n^{-2} \sum_{i=1}^n \sum_{j=1}^n {\sf D}^{\otimes r}K_{\bold{{\rm H}}}(\bold{X}_i-\bold{X}_j).$$ The plug-in selectors are hpi.kfe
(1-d), Hpi.kfe
(2- to 6-d).
Kernel functional estimates are usually not required to be computed directly by the user, but only within other functions in the package.
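For instance, the $r=0$ case of the double sum above reduces to plain kernel evaluations over all pairs; a minimal NumPy sketch (illustrative only — in ks this quantity is computed internally, e.g. within the 2-sample test):

```python
import numpy as np

def psi0(X, H):
    """Zero-order integrated density functional
    psi_0 = n^-2 sum_i sum_j K_H(X_i - X_j), Gaussian kernel."""
    n, d = X.shape
    Hinv = np.linalg.inv(H)
    norm = 1.0 / np.sqrt((2 * np.pi) ** d * np.linalg.det(H))
    diffs = X[:, None, :] - X[None, :, :]            # (n, n, d) pairwise X_i - X_j
    quad = np.einsum('ijk,kl,ijl->ij', diffs, Hinv, diffs)
    return norm * np.exp(-0.5 * quad).mean()
```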
--For kernel-based 2-sample testing, the main function is
kde.test
which computes the integrated
$L_2$ distance between the two density estimates as the test
statistic, comprising a linear combination of 0-th order kernel
functional estimates:
$$\hat{T} = \hat{\psi}_{0,1} + \hat{\psi}_{0,2} - (\hat{\psi}_{0,12} +
\hat{\psi}_{0,21}),$$ and the corresponding p-value. The $\psi$ are
zero order kernel functional estimates with the subscripts indicating
that 1 = sample 1 only, 2 = sample 2 only, and 12, 21 =
samples 1 and 2. The bandwidth selectors are hpi.kfe, Hpi.kfe with deriv.order=0.
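The statistic $\hat{T}$ can be sketched directly from zero-order functional estimates. This is a simplified illustration with a single shared bandwidth matrix, whereas the subscripts above indicate sample-specific estimates, and the p-value computation is omitted:

```python
import numpy as np

def kde_test_stat(X1, X2, H):
    """T_hat = psi_{0,1} + psi_{0,2} - (psi_{0,12} + psi_{0,21}):
    the integrated L2 distance between the two density estimates,
    built from zero-order functional estimates (Gaussian kernel)."""
    def psi0(A, B):
        d = A.shape[1]
        Hinv = np.linalg.inv(H)
        norm = 1.0 / np.sqrt((2 * np.pi) ** d * np.linalg.det(H))
        diffs = A[:, None, :] - B[None, :, :]        # pairwise A_i - B_j
        quad = np.einsum('ijk,kl,ijl->ij', diffs, Hinv, diffs)
        return norm * np.exp(-0.5 * quad).mean()
    return psi0(X1, X1) + psi0(X2, X2) - (psi0(X1, X2) + psi0(X2, X1))
```

Identical samples give $\hat{T}=0$, and with a shared bandwidth the statistic is non-negative, as befits an integrated squared distance.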
--For kernel-based local 2-sample testing, the main function is
kde.local.test
which computes the squared distance
between the two density estimates as the test
statistic $$\hat{U}(\bold{x}) = [\hat{f}_1(\bold{x}) - \hat{f}_2(\bold{x})]^2$$ and the corresponding local
p-values. The bandwidth selectors are those used with kde
,
e.g. hpi, Hpi
.
--For kernel cumulative distribution function estimation, the main
function is kcde which computes
$$\hat{F}(\bold{x}) = n^{-1} \sum_{i=1}^n
\mathcal{K}_{\bold{{\rm H}}} (\bold{x} - \bold{X}_i)$$
where $\mathcal{K}$ is the integrated kernel.
The bandwidth selectors are hpi.kcde, Hpi.kcde. Its plot method is plot.kcde.
There exist analogous functions for the survival function $\hat{\bar{F}}$.
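In 1-d with a Gaussian kernel, the integrated kernel $\mathcal{K}$ is the standard normal CDF, so the formula above can be sketched as follows (a hand-rolled illustration, not the ks implementation; in ks the bandwidth h would come from hpi.kcde):

```python
import math

def kcde_1d(x, X, h):
    """1-d kernel CDF estimate F_hat(x) = n^-1 sum_i Phi((x - X_i) / h),
    where Phi, the standard normal CDF, is the integrated Gaussian kernel."""
    phi = lambda u: 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))
    return sum(phi((x - xi) / h) for xi in X) / len(X)
```

By symmetry, the estimate at the midpoint of a symmetric sample is exactly 1/2, and the estimate is monotone increasing in x.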
--For kernel estimation of a ROC (receiver operating characteristic)
curve to compare two samples from $\hat{F}_1,
\hat{F}_2$, the main function is kroc which computes
$$(\hat{F}_{\hat{Y}_1}(z),
\hat{F}_{\hat{Y}_2}(z))$$ based on the cumulative distribution functions of
$\hat{Y}_j = \hat{\bar{F}}_1(\bold{X}_j), j=1,2$.
The bandwidth selectors are those used with kcde, e.g. hpi.kcde, Hpi.kcde for $\hat{F}_{\hat{Y}_j}, \hat{\bar{F}}_1$. Its plot method is plot.kroc.
--For kernel estimation of a copula, the
main function is kcopula which computes
$$\hat{C}(\bold{z}) = \hat{F}(\hat{F}_1^{-1}(z_1), \dots,
\hat{F}_d^{-1}(z_d))$$
where $\hat{F}_j^{-1}(z_j)$ is
the $z_j$-th quantile of the $j$-th marginal
distribution $\hat{F}_j$.
The bandwidth selectors are those used with kcde for $\hat{F}, \hat{F}_j$. Its plot method is plot.kcde.
--For kernel estimation of a copula density, the
main function is kcopula.de which computes
$$\hat{c}(\bold{z}) = n^{-1} \sum_{i=1}^n
K_{\bold{{\rm H}}} (\bold{z} - \hat{\bold{Z}}_i)$$
where $\hat{\bold{Z}}_i = (\hat{F}_1(X_{i1}), \dots,
\hat{F}_d(X_{id}))$.
The bandwidth selectors are those used with kde for $\hat{c}$ and kcde for $\hat{F}_j$. Its plot method is plot.kde.
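The transformation to the pseudo-observations $\hat{\bold{Z}}_i$ can be sketched with empirical marginal CDFs standing in for the smooth kcde estimates. This is a simplified stand-in for illustration; the function name pseudo_obs and the rank-based marginals are assumptions of this sketch:

```python
import numpy as np

def pseudo_obs(X):
    """Pseudo-observations Z_hat_i = (F_1(X_i1), ..., F_d(X_id)), here with
    empirical (rank-based) marginal CDFs scaled into the open unit cube.
    The copula density estimate is then a KDE of these Z_hat_i."""
    n = X.shape[0]
    ranks = X.argsort(axis=0).argsort(axis=0) + 1    # marginal ranks 1..n
    return ranks / (n + 1.0)                         # values strictly in (0, 1)
```

Each column is a monotone transform of the corresponding margin, so the within-sample ordering of the data is preserved.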
--Binned kernel estimation is available for d = 1, 2, 3, 4. This makes kernel estimators feasible for large samples.
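The speed-up from binning comes from spreading each data point's unit mass over neighbouring grid points, so that kernel sums run over the (much smaller) grid rather than over all $n$ observations. A minimal 1-d linear-binning sketch, illustrating the idea rather than the ks binning code:

```python
import numpy as np

def linbin(x, grid):
    """1-d linear binning: each point's unit mass is split between its two
    neighbouring grid points in proportion to proximity. Points outside
    the grid are ignored in this sketch."""
    counts = np.zeros(len(grid))
    delta = grid[1] - grid[0]                        # grid spacing
    for xi in x:
        pos = (xi - grid[0]) / delta                 # fractional grid index
        j = int(np.floor(pos))
        w = pos - j                                  # weight for right neighbour
        if 0 <= j < len(grid) - 1:
            counts[j] += 1 - w
            counts[j + 1] += w
    return counts
```

A point midway between two grid points contributes half its mass to each, and the total binned mass equals the number of in-range points.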
--For an overview of this package with 2-d density estimation, see vignette("kde").
Scott, D.W. (1992) Multivariate Density Estimation: Theory, Practice, and Visualization. John Wiley & Sons, New York.
Silverman, B.W. (1986) Density Estimation for Statistics and Data Analysis. Chapman & Hall/CRC, London.
Simonoff, J.S. (1996) Smoothing Methods in Statistics. Springer-Verlag, New York.
Wand, M.P. & Jones, M.C. (1995) Kernel Smoothing. Chapman & Hall/CRC, London.
See also the packages sm, KernSmooth.