```
dcov(x, y, index = 1.0)
dcor(x, y, index = 1.0)
DCOR(x, y, index = 1.0)
```

x

data or distances of first sample

y

data or distances of second sample

index

exponent on Euclidean distance, in (0,2]

`dcov` returns the sample distance covariance and `dcor` returns the sample distance correlation. `DCOR` returns a list with elements

- `dCov`: sample distance covariance
- `dCor`: sample distance correlation
- `dVarX`: distance variance of x sample
- `dVarY`: distance variance of y sample

- independence
- distance correlation
- distance covariance
- energy statistics

`dcov`, `dcor`, and `DCOR` compute distance covariance and distance
correlation statistics. `DCOR` is a self-contained R function returning
a list of statistics. `dcor` execution is faster than `DCOR` (see examples).

The sample sizes (number of rows) of the two samples must
agree, and samples must not contain missing values. Arguments
`x`, `y` can optionally be `dist` objects;
otherwise these arguments are treated as data.

Distance correlation is a new measure of dependence between random
vectors introduced by Szekely, Rizzo, and Bakirov (2007).
For all distributions with finite first moments, distance
correlation $\mathcal R$ generalizes the idea of correlation in two
fundamental ways:
(1) $\mathcal R(X,Y)$ is defined for $X$ and $Y$ in arbitrary dimension.
(2) $\mathcal R(X,Y)=0$ characterizes independence of $X$ and
$Y$.
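
Property (2) can be illustrated with a pair of variables that are fully dependent yet linearly uncorrelated. The snippet below is a minimal base R sketch, not the package's implementation: `dcor_sketch` is a hypothetical helper written from the empirical formulas given later on this page (index 1 only), so that the example runs without the package installed.

```r
# Hypothetical helper (not part of the energy package): empirical
# distance correlation with index 1, from the formulas on this page.
dcor_sketch <- function(x, y) {
  # Double centering: subtract row and column means, add the grand mean
  center <- function(d) d - outer(rowMeans(d), colMeans(d), "+") + mean(d)
  A <- center(as.matrix(dist(x)))
  B <- center(as.matrix(dist(y)))
  sqrt(mean(A * B) / sqrt(mean(A * A) * mean(B * B)))
}

x <- seq(-1, 1, length.out = 101)  # grid symmetric about zero
y <- x^2                           # determined by x, but not linearly

cor(x, y)          # Pearson correlation: zero up to rounding error
dcor_sketch(x, y)  # distance correlation: clearly positive
```

Pearson correlation misses the dependence here because the relationship is symmetric about zero, while the distance-based statistic does not.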
Distance correlation satisfies $0 \le \mathcal R \le 1$, and
$\mathcal R = 0$ only if $X$ and $Y$ are independent. Distance
covariance $\mathcal V$ provides a new approach to the problem of
testing the joint independence of random vectors. The formal
definitions of the population coefficients $\mathcal V$ and
$\mathcal R$ are given in (SRB 2007). The definitions of the
empirical coefficients are as follows.
The empirical distance covariance $\mathcal{V}_n(\mathbf{X,Y})$
with index 1 is
the nonnegative number defined by
$$\mathcal{V}^2_n (\mathbf{X,Y}) = \frac{1}{n^2} \sum_{k,\,l=1}^n
A_{kl}B_{kl}$$
where $A_{kl}$ and $B_{kl}$ are
$$A_{kl} = a_{kl}-\bar a_{k.}- \bar a_{.l} + \bar a_{..}$$
$$B_{kl} = b_{kl}-\bar b_{k.}- \bar b_{.l} + \bar b_{..}.$$
Here
$$a_{kl} = \|X_k - X_l\|_p, \quad b_{kl} = \|Y_k - Y_l\|_q, \quad
k,l=1,\dots,n,$$
and the subscript `.` denotes that the mean is computed for the
index that it replaces. Similarly,
$\mathcal{V}_n(\mathbf{X})$ is the nonnegative number defined by
$$\mathcal{V}^2_n (\mathbf{X}) = \mathcal{V}^2_n (\mathbf{X,X}) =
\frac{1}{n^2} \sum_{k,\,l=1}^n
A_{kl}^2.$$
The empirical distance correlation $\mathcal{R}_n(\mathbf{X,Y})$ is
the square root of
$$\mathcal{R}^2_n(\mathbf{X,Y})=
\frac {\mathcal{V}^2_n(\mathbf{X,Y})}
{\sqrt{ \mathcal{V}^2_n (\mathbf{X}) \mathcal{V}^2_n(\mathbf{Y})}}.$$
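
The definitions above translate almost line by line into base R. The following is a sketch under stated assumptions rather than the package's code: `dcov_sketch` is a hypothetical name, the `index` argument is assumed to act as an exponent on the Euclidean distances $a_{kl}$ and $b_{kl}$, and the returned element names simply mirror those listed for `DCOR`. Setting the correlation to 0 when a distance variance is zero is also an assumption made here to avoid dividing by zero.

```r
# Hypothetical helper (not part of the energy package): computes the
# empirical statistics directly from the definitions.
dcov_sketch <- function(x, y, index = 1.0) {
  # Accept either data or 'dist' objects
  dx <- if (inherits(x, "dist")) as.matrix(x) else as.matrix(dist(x))
  dy <- if (inherits(y, "dist")) as.matrix(y) else as.matrix(dist(y))
  stopifnot(nrow(dx) == nrow(dy), !anyNA(dx), !anyNA(dy),
            index > 0, index <= 2)

  # a_kl and b_kl: Euclidean distances raised to the power 'index'
  a <- dx^index
  b <- dy^index

  # A_kl = a_kl - (mean of row k) - (mean of column l) + (grand mean)
  center <- function(d) d - outer(rowMeans(d), colMeans(d), "+") + mean(d)
  A <- center(a)
  B <- center(b)

  # Squared statistics are averages over all n^2 pairs (k, l);
  # max(., 0) guards against tiny negative values from rounding
  v2xy <- max(mean(A * B), 0)
  v2x  <- mean(A * A)
  v2y  <- mean(B * B)

  denom <- sqrt(v2x * v2y)
  r2 <- if (denom > 0) v2xy / denom else 0  # assumed convention

  list(dCov = sqrt(v2xy), dCor = sqrt(r2),
       dVarX = sqrt(v2x), dVarY = sqrt(v2y))
}

x <- iris[1:50, 1:4]
y <- iris[51:100, 1:4]
res <- dcov_sketch(x, y)
str(res)
```

If the package is installed, comparing `res$dCov` with `dcov(x, y)` is a quick way to check whether these assumptions match the package's behaviour.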
See `dcov.test` for a test of multivariate independence
based on the distance covariance statistic.

See also: `dcov.test`, `dcor.ttest`

```
x <- iris[1:50, 1:4]
y <- iris[51:100, 1:4]
dcov(x, y)
dcov(dist(x), dist(y)) #same thing
## C implementation
dcov(x, y, 1.5)
dcor(x, y, 1.5)
.dcov(dist(x), dist(y), 1.5)
## R implementation
DCOR(x, y, 1.5)
## compare speed of R version and C version
set.seed(111)
## R version
system.time(replicate(1000, DCOR(x, y)))
set.seed(111)
## C version
system.time(replicate(1000, .dcov(x, y)))
```
