eqdist.etest: Multisample E-statistic (Energy) Test of Equal Distributions

Description

Performs the nonparametric multisample E-statistic (energy) test for equality of multivariate distributions.

Usage

eqdist.etest(x, sizes, distance = FALSE, R = 999)
eqdist.e(x, sizes, distance = FALSE)

Arguments

x

data matrix of pooled sample

sizes

vector of sample sizes

distance

logical: if TRUE, first argument is a distance matrix

R

number of bootstrap replicates

Value

A list with class htest containing

method

description of test

statistic

observed value of the test statistic

p.value

approximate p-value of the test

data.name

description of data

eqdist.e returns the test statistic only.
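
As an illustration of the components listed above, the following sketch stores the result of eqdist.etest and reads its fields by name. It assumes the energy package is installed; the seed and R = 199 are arbitrary choices made for this example.

```r
# Sketch only: assumes the energy package is installed.
library(energy)

set.seed(1)  # the replicates are random, so fix a seed for repeatability

res <- eqdist.etest(iris[, 1:4], sizes = c(50, 50, 50), R = 199)

class(res)       # the result has class "htest"
res$method       # description of test
res$statistic    # observed value of the test statistic
res$p.value      # approximate p-value of the test
res$data.name    # description of data

# eqdist.e returns the test statistic only, not an htest object
e <- eqdist.e(iris[, 1:4], sizes = c(50, 50, 50))
```

Because the result is a standard htest object, printing res shows the usual test summary.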

concept

energy statistics

Details

The k-sample multivariate $\mathcal{E}$-test of equal distributions is performed. The statistic is computed from the original pooled samples, stacked in matrix x where each row is a multivariate observation, or from the corresponding distance matrix. The first sizes[1] rows of x are the first sample, the next sizes[2] rows of x are the second sample, and so on.

The test is implemented by nonparametric bootstrap, an approximate permutation test with R replicates. For large samples it is more efficient if x contains the data matrix rather than the distances.

The function eqdist.e returns the test statistic only; it simply passes the arguments through to eqdist.etest with R = 0. For computing the statistic only (no test), ksample.e is usually faster. The definition of the multisample $\mathcal{E}$-statistic is given in the ksample.e documentation.
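
To make the row layout and the permutation idea concrete, here is a minimal base R sketch. It is not the package's implementation. It assumes the commonly stated pairwise form of the energy statistic, a weighted 2 * between - within1 - within2 of mean distances summed over all pairs of samples, and the ksample.e documentation remains the authoritative definition. The names e_pair, e_stat, and e_perm_test are hypothetical helpers, and the p-value convention shown is one common choice.

```r
# Illustrative sketch in base R; helper names are hypothetical.

# Pairwise term from a full distance matrix d; i and j are the row
# indices of two samples. Within-sample means include the zero diagonal.
e_pair <- function(d, i, j) {
  n1 <- length(i)
  n2 <- length(j)
  between <- mean(d[i, j])
  within1 <- mean(d[i, i])
  within2 <- mean(d[j, j])
  (n1 * n2 / (n1 + n2)) * (2 * between - within1 - within2)
}

# Multisample statistic: the first sizes[1] rows form sample 1, the next
# sizes[2] rows form sample 2, and so on; pairwise terms are summed.
e_stat <- function(d, sizes) {
  stopifnot(sum(sizes) == nrow(d))
  groups <- split(seq_len(sum(sizes)), rep(seq_along(sizes), sizes))
  k <- length(groups)
  total <- 0
  for (a in seq_len(k - 1)) {
    for (b in (a + 1):k) {
      total <- total + e_pair(d, groups[[a]], groups[[b]])
    }
  }
  total
}

# Approximate permutation test: shuffle the rows of the pooled sample,
# keep the sample sizes fixed, and recompute the statistic R times.
e_perm_test <- function(x, sizes, R = 199) {
  d <- as.matrix(dist(x))
  n <- nrow(d)
  observed <- e_stat(d, sizes)
  replicates <- replicate(R, {
    p <- sample(n)
    e_stat(d[p, p], sizes)
  })
  # Assumed convention: count the observed value among the replicates
  p_value <- (1 + sum(replicates >= observed)) / (R + 1)
  list(statistic = observed, p.value = p_value)
}

set.seed(42)
x <- rbind(matrix(rnorm(200), nrow = 40),
           matrix(rnorm(250, mean = 5), nrow = 50))
out <- e_perm_test(x, sizes = c(40, 50), R = 99)
```

The sketch computes the distance matrix once and permutes its rows and columns together, which is equivalent to relabeling the observations. It favors readability over speed, so for real work the package functions are the better choice.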

References

Szekely, G. J. and Rizzo, M. L. (2004) Testing for Equal Distributions in High Dimension, InterStat, November (5).

Szekely, G. J. (2000) Technical Report 03-05: $\mathcal{E}$-statistics: Energy of Statistical Samples, Department of Mathematics and Statistics, Bowling Green State University.

Examples


library(energy)

data(iris)

## test if the 3 varieties of iris data (d=4) have equal distributions
eqdist.etest(iris[,1:4], c(50,50,50), R = 199)

## test statistic only, same data
eqdist.e(iris[,1:4], c(50,50,50))

## two simulated samples (40 and 50 rows, d=5), passed as a distance matrix
x <- matrix(rnorm(200), nrow=40)
y <- matrix(rnorm(250, mean=5), nrow=50)
x <- rbind(x, y)
eqdist.etest(dist(x), sizes=c(40, 50), distance=TRUE, R = 19)
eqdist.e(dist(x), sizes=c(40, 50), distance=TRUE)
