energy (version 1.6.2)

# dcov.test: Distance Covariance Test

## Description

Distance covariance test of multivariate independence. Distance covariance and distance correlation are multivariate measures of dependence.

## Usage

`dcov.test(x, y, index = 1.0, R = 199)`

## Arguments

x
data or distances of first sample
y
data or distances of second sample
R
number of replicates
index
exponent on Euclidean distance, in (0,2]

## Value

`dcov.test` returns a list with class `htest` containing
method
description of test
statistic
observed value of the test statistic
estimate
dCov(x,y)
estimates
a vector: [dCov(x,y), dCor(x,y), dVar(x), dVar(y)]
replicates
replicates of the test statistic
p.value
approximate p-value of the test
data.name
description of data
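
The components listed above can be read off the returned object with `$`. The snippet below is a small illustration rather than part of the package documentation: it assumes the energy package is installed (hence the `requireNamespace` guard), and the object name `res` is arbitrary.

```r
# Illustrative only: inspect the components of the htest object.
if (requireNamespace("energy", quietly = TRUE)) {
  set.seed(1)
  x <- iris[1:50, 1:4]
  y <- iris[51:100, 1:4]

  res <- energy::dcov.test(x, y, R = 199)

  res$method              # description of the test
  res$statistic           # observed value of the test statistic
  res$estimates           # dCov(x,y), dCor(x,y), dVar(x), dVar(y)
  length(res$replicates)  # one replicate per permutation
  res$p.value             # approximate p-value
}
```

Printing `res` itself uses the standard `htest` print method, which shows the same information in summary form.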

## Details

`dcov.test` performs a nonparametric test of multivariate independence. The test decision is obtained via permutation bootstrap, with `R` replicates. The sample sizes (number of rows) of the two samples must agree, and samples must not contain missing values. Arguments `x`, `y` can optionally be `dist` objects; otherwise these arguments are treated as data. The statistic is \$nV_n^2\$ where \$V_n(x,y)\$ = dcov(x,y), which is based on interpoint Euclidean distances \$||x_{i}-x_{j}||\$. The `index` is an optional exponent on Euclidean distance.
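
The procedure described above can be sketched in base R. This is a minimal sketch under stated assumptions, not the package's compiled implementation: it uses the V-statistic form of the squared sample distance covariance (the mean of the elementwise product of double-centred distance matrices), takes the p-value as (1 + number of replicates at least as large as the observed statistic) / (1 + R), and accepts data only, not `dist` objects. The helper names `double_center` and `perm_test_sketch` are hypothetical.

```r
# Double-centre a distance matrix: subtract row and column means,
# then add back the grand mean.
double_center <- function(D) {
  sweep(sweep(D, 1, rowMeans(D)), 2, colMeans(D)) + mean(D)
}

# Permutation test of independence based on n * V_n^2 (hypothetical helper).
perm_test_sketch <- function(x, y, index = 1, R = 199) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  n <- nrow(x)
  # Sample sizes must agree and missing values are not allowed.
  stopifnot(nrow(y) == n, !anyNA(x), !anyNA(y))

  # Centred matrices of Euclidean distances raised to the power `index`.
  A <- double_center(as.matrix(dist(x))^index)
  B <- double_center(as.matrix(dist(y))^index)

  # n * V_n^2 after permuting the sample indices of y.
  stat_for <- function(idx) n * mean(A * B[idx, idx])

  observed <- stat_for(seq_len(n))
  reps <- replicate(R, stat_for(sample.int(n)))

  list(statistic  = observed,
       replicates = reps,
       p.value    = (1 + sum(reps >= observed)) / (1 + R))
}

set.seed(1)
out <- perm_test_sketch(iris[1:50, 1:4], iris[51:100, 1:4])
out$statistic
out$p.value

# Degenerate case: a constant sample has all centred distances equal to zero.
deg <- perm_test_sketch(rep.int(1, 10), 1:10)
deg$statistic
deg$p.value
```

Permuting the rows and columns of the centred matrix `B` together is equivalent to permuting the observations of `y`, so the distances need to be computed only once.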

Distance correlation is a new measure of dependence between random vectors introduced by Szekely, Rizzo, and Bakirov (2007). For all distributions with finite first moments, distance correlation \$R\$ generalizes the idea of correlation in two fundamental ways:

(1) \$R(X,Y)\$ is defined for \$X\$ and \$Y\$ in arbitrary dimension. (2) \$R(X,Y)=0\$ characterizes independence of \$X\$ and \$Y\$.

Characterization (2) also holds for powers of Euclidean distance \$|x_i-x_j|^s\$, where \$0 < s < 2\$.

Distance correlation satisfies \$0 \le R \le 1\$, and \$R = 0\$ only if \$X\$ and \$Y\$ are independent. Distance covariance \$V\$ provides a new approach to the problem of testing the joint independence of random vectors. The formal definitions of the population coefficients \$V\$ and \$R\$ are given in (SRB 2007). The definitions of the empirical coefficients are given in the energy `dcov` topic.
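
To illustrate how distance correlation differs from Pearson correlation, the following base-R sketch computes a sample distance correlation from double-centred distance matrices and applies it to a purely nonlinear relationship. It assumes the V-statistic form of the sample coefficients; `double_center` and `dcor_sketch` are hypothetical helper names and not functions of the energy package.

```r
# Double-centre a distance matrix (row means, column means, grand mean).
double_center <- function(D) {
  sweep(sweep(D, 1, rowMeans(D)), 2, colMeans(D)) + mean(D)
}

# Sample distance correlation (hypothetical helper, V-statistic form).
dcor_sketch <- function(x, y, index = 1) {
  A <- double_center(as.matrix(dist(x))^index)
  B <- double_center(as.matrix(dist(y))^index)
  v_xy <- mean(A * B)  # squared sample distance covariance
  v_x  <- mean(A * A)  # squared sample distance variance of x
  v_y  <- mean(B * B)  # squared sample distance variance of y
  if (v_x * v_y <= 0) return(0)  # degenerate (constant) sample
  sqrt(max(v_xy, 0) / sqrt(v_x * v_y))
}

# y is a deterministic function of x, yet the two are linearly uncorrelated.
x <- seq(-1, 1, length.out = 101)
y <- x^2

cor(x, y)          # Pearson correlation: essentially zero
dcor_sketch(x, y)  # distance correlation: clearly positive
dcor_sketch(x, x)  # identical samples give the maximum value
```

On the symmetric grid the Pearson correlation vanishes even though `y` is completely determined by `x`, while the distance correlation stays well away from zero.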

For all values of the index in (0,2), under independence the asymptotic distribution of \$nV_n^2\$ is a quadratic form of centered Gaussian random variables, with coefficients that depend on the distributions of \$X\$ and \$Y\$. For the general problem of testing independence when the distributions of \$X\$ and \$Y\$ are unknown, the test based on \$n V_n^2\$ can be implemented as a permutation test. See (SRB 2007) for theoretical properties of the test, including statistical consistency.

## References

Szekely, G.J., Rizzo, M.L., and Bakirov, N.K. (2007), Measuring and Testing Dependence by Correlation of Distances, Annals of Statistics, Vol. 35 No. 6, pp. 2769-2794. http://dx.doi.org/10.1214/009053607000000505

Szekely, G.J. and Rizzo, M.L. (2009), Brownian Distance Covariance, Annals of Applied Statistics, Vol. 3, No. 4, pp. 1236-1265. http://dx.doi.org/10.1214/09-AOAS312

Szekely, G.J. and Rizzo, M.L. (2009), Rejoinder: Brownian Distance Covariance, Annals of Applied Statistics, Vol. 3, No. 4, pp. 1303-1308.

## See Also

`dcov`, `dcor`, `DCOR`, `dcor.ttest`

## Examples

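```r
library(energy)

# Two samples of 50 observations on the same four measurements
x <- iris[1:50, 1:4]
y <- iris[51:100, 1:4]

# Same seed before each call so the permutation replicates match
set.seed(1)
dcov.test(x, y)
set.seed(1)
dcov.test(dist(x), dist(y))  # same thing, passing distances

# Exponent 0.5 on Euclidean distance
set.seed(1)
dcov.test(x, y, index = 0.5)
set.seed(1)
dcov.test(dist(x), dist(y), index = 0.5)  # same thing, passing distances

## Example with dvar=0 (so dcov=0 and pval=1)
x <- rep.int(1, 10)
y <- 1:10
dcov.test(x, y, R = 199)
```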
