Kernel density based local two-sample comparison test for 1- to 6-dimensional data.
kde.local.test(x1, x2, H1, H2, h1, h2, fhat1, fhat2, gridsize, binned=FALSE,
bgridsize, verbose=FALSE, supp=3.7, mean.adj=FALSE, signif.level=0.05,
min.ESS, xmin, xmax)
x1,x2: vector/matrix of data values
H1,H2,h1,h2: bandwidth matrices/scalar bandwidths. If these are missing, Hpi or hpi is called by default.
fhat1,fhat2: objects of class kde
binned: flag for binned estimation. Default is FALSE.
gridsize: vector of grid sizes
bgridsize: vector of binning grid sizes
verbose: flag to print out progress information. Default is FALSE.
supp: effective support for normal kernel
mean.adj: flag to compute second order correction for mean value of critical sampling distribution. Default is FALSE. Currently implemented for d<=2 only.
signif.level: significance level. Default is 0.05.
min.ESS: minimum effective sample size. See below for details.
xmin,xmax: vector of minimum/maximum values for grid
A kernel two-sample local significance test is an object of class kde.loctest, which is a list with fields:
kernel density estimates, objects of class kde
chi squared test statistic
matrix of local p-values at each grid point
difference of KDEs
mean of the test statistic
variance of the test statistic
binary matrix to indicate locally significant fhat1 > fhat2
binary matrix to indicate locally significant fhat1 < fhat2
sample sizes
bandwidth matrices/scalar bandwidths
The null hypothesis is \(H_0(\bold{x}): f_1(\bold{x}) = f_2(\bold{x})\) where \(f_1, f_2\) are the respective density functions. The measure of discrepancy is \(U(\bold{x}) = [f_1(\bold{x}) - f_2(\bold{x})]^2\). Duong (2013) shows that the test statistic obtained by substituting the KDEs for the true densities has a null distribution that is asymptotically chi-squared with 1 d.f.
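As a rough illustration of how a chi-squared statistic with 1 d.f. translates into a local decision, the sketch below converts made-up statistic values into pointwise p-values and compares them with signif.level. The values in stat are hypothetical, and the sketch ignores any multiple-comparison adjustment that the function may apply internally.

```r
# Hypothetical test statistic values at five grid points (illustrative only)
stat <- c(0.2, 1.5, 3.9, 7.1, 12.4)
signif.level <- 0.05

# Upper-tail probability under the asymptotic chi-squared null with 1 d.f.
pvalue <- pchisq(stat, df = 1, lower.tail = FALSE)

# Grid points where the null hypothesis is rejected pointwise
reject <- pvalue < signif.level
print(data.frame(stat, pvalue, reject))
```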
The required input is either x1,x2 and H1,H2, or fhat1,fhat2, i.e. the data values and bandwidths or objects of class kde. In the former case, the kde objects are created.
If H1,H2 are missing, then the default is the plug-in selector Hpi. Likewise, hpi is the default for missing h1,h2.
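The two input routes can be sketched for univariate data as follows. This is a minimal sketch, not a definitive recipe: it assumes the ks and MASS packages are installed, and it assumes that kde objects passed as fhat1,fhat2 should be evaluated on a common grid, which is why a shared xmin/xmax is supplied to kde. The padding of 3.7 bandwidths mirrors the supp default and is a choice made for this illustration.

```r
library(ks)
library(MASS)   # provides the crabs data

x1 <- crabs[crabs$sp == "B", 4]
x2 <- crabs[crabs$sp == "O", 4]

# Route 1: data values plus explicitly computed plug-in bandwidths
h1 <- hpi(x1)
h2 <- hpi(x2)
loct.a <- kde.local.test(x1 = x1, x2 = x2, h1 = h1, h2 = h2)

# Route 2: precomputed kde objects
# Assumption: both estimates share one evaluation grid (same xmin/xmax)
xmin <- min(x1, x2) - 3.7 * max(h1, h2)
xmax <- max(x1, x2) + 3.7 * max(h1, h2)
fhat1 <- kde(x = x1, h = h1, xmin = xmin, xmax = xmax)
fhat2 <- kde(x = x2, h = h2, xmin = xmin, xmax = xmax)
loct.b <- kde.local.test(fhat1 = fhat1, fhat2 = fhat2)
```

Either result can then be passed to plot in the same way as in the example at the end of this page.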
The mean.adj flag determines whether the second order correction to the mean value of the test statistic should be computed.
min.ESS is borrowed from Godtliebsen et al. (2002) to reduce spurious significant results in the tails, though it is usually not required for small to moderate sample sizes.
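For context, the effective sample size in the scale-space literature is commonly defined at a point x as the sum of kernel weights K_h(x - X_i) divided by K_h(0). The sketch below evaluates this quantity for a normal kernel in one dimension using base R only. The definition, the helper name ess_1d, the simulated data and the threshold of 5 are all assumptions made for illustration; this page does not state how kde.local.test computes or applies min.ESS.

```r
# Hypothetical helper: ESS at each evaluation point for a 1-d normal kernel
ess_1d <- function(x, eval.points, h) {
  sapply(eval.points, function(u) sum(dnorm(u - x, sd = h)) / dnorm(0, sd = h))
}

set.seed(1)
x <- rnorm(200)                      # simulated data, illustrative only
grid <- seq(-4, 4, length.out = 9)
ess <- ess_1d(x, grid, h = 0.3)

# Flag grid points with little local data (illustrative threshold)
low.data <- ess < 5
print(data.frame(grid, ess = round(ess, 2), low.data))
```

Grid points in the tails receive small values because few observations fall within a bandwidth of them, which is where spurious significant differences are most likely.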
Duong, T. (2013) Local significant differences from non-parametric two-sample tests. Journal of Nonparametric Statistics, 25, 635-645.
Godtliebsen, F., Marron, J.S. & Chaudhuri, P. (2002) Significance in scale space for bivariate density estimation. Journal of Computational and Graphical Statistics, 11, 1-22.
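# NOT RUN {
library(ks)
library(MASS)  ## provides the crabs data
## column 4 (FL) for the blue and orange species
x1 <- crabs[crabs$sp=="B", 4]
x2 <- crabs[crabs$sp=="O", 4]
## bandwidths are missing, so the default plug-in selector is used
loct <- kde.local.test(x1=x1, x2=x2)
plot(loct)
## see examples in ? plot.kde.loctest
# }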