For \(\bm{X} \in \mathbb{R}^{p}\) and \(Y \in \{1, 2, \ldots,
R\}\), the (population-level) semi-distance covariance is defined as
$$\mathrm{SDcov}(\bm{X}, Y) =
\mathrm{E}\left[\|\bm{X}-\widetilde{\bm{X}}\|\left(1-\sum_{r=1}^R
I(Y=r,\widetilde{Y}=r)/p_r\right)\right],$$ where \(p_r = P(Y = r)\) and
\((\widetilde{\bm{X}}, \widetilde{Y})\) is an iid copy of \((\bm{X},
Y)\).
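As a reading aid (not part of the original definition), the expectation can be expanded class by class. Assuming \(p_r > 0\) for every \(r\), independence of the two copies gives \(P(Y = r, \widetilde{Y} = r) = p_r^2\), so
$$\mathrm{E}\left[\|\bm{X}-\widetilde{\bm{X}}\|\, I(Y=r,\widetilde{Y}=r)\right]/p_r
= p_r\, \mathrm{E}\left[\|\bm{X}-\widetilde{\bm{X}}\| \mid Y=r,\widetilde{Y}=r\right],$$
and therefore
$$\mathrm{SDcov}(\bm{X}, Y) = \mathrm{E}\|\bm{X}-\widetilde{\bm{X}}\|
- \sum_{r=1}^R p_r\, \mathrm{E}\left[\|\bm{X}-\widetilde{\bm{X}}\| \mid Y=r,\widetilde{Y}=r\right].$$
In words, the semi-distance covariance compares the overall expected distance between two independent draws of \(\bm{X}\) with the \(p_r\)-weighted average of the expected within-class distances.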
The (population-level) semi-distance correlation is defined as
$$\mathrm{SDcor}(\bm{X}, Y) = \dfrac{\mathrm{SDcov}(\bm{X},
Y)}{\mathrm{dvar}(\bm{X})\sqrt{R-1}},$$ where \(\mathrm{dvar}(\bm{X})\) is
the distance variance (Szekely, Rizzo, and Bakirov 2007) of \(\bm{X}\).
Given \(n\) observations \(\{(\bm{X}_i, Y_i)\}_{i=1}^{n}\), sdcov()
and sdcor() compute sample estimates of the semi-distance covariance and
correlation, respectively.
If type = "V", the semi-distance covariance statistic is computed as a
V-statistic, which takes a form very similar to the energy-based statistic
with double centering and is always non-negative. Specifically,
$$\text{SDcov}_n(\bm{X}, y) = \frac{1}{n^2} \sum_{k=1}^{n}
\sum_{l=1}^{n} A_{kl} B_{kl},$$
where $$A_{kl} = a_{kl} - \bar{a}_{k.} - \bar{a}_{.l} + \bar{a}_{..}$$
is the double centering (Szekely, Rizzo, and Bakirov 2007) of
\(a_{kl} = \| \bm{X}_k - \bm{X}_l \|,\) and $$B_{kl} =
1 - \sum_{r=1}^{R} I(Y_k = r) I(Y_l = r) / \hat{p}_r$$ with \(\hat{p}_r =
n_r / n = n^{-1}\sum_{i=1}^{n} I(Y_i = r)\).
The semi-distance correlation statistic is $$\text{SDcor}_n(\bm{X}, y)
= \dfrac{\text{SDcov}_n(\bm{X}, y)}{\text{dvar}_n(\bm{X})\sqrt{R - 1}},$$
where \(\text{dvar}_n(\bm{X})\) is the V-statistic estimate of the distance
variance of \(\bm{X}\).
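The V-statistic formulas above can be traced step by step in a short standalone sketch. It is written in plain Python for illustration only and is not the package's implementation. The names `sdcov_v`, `dvar_v` and `sdcor_v` are hypothetical. The sketch also assumes that \(\text{dvar}_n(\bm{X})\) is the square root of \(n^{-2}\sum_{k,l} A_{kl}^2\), following the convention of Szekely, Rizzo, and Bakirov (2007).

```python
import math
from collections import Counter


def pairwise_distances(X):
    """a_kl = ||X_k - X_l|| for every pair of rows of X."""
    n = len(X)
    return [[math.dist(X[k], X[l]) for l in range(n)] for k in range(n)]


def double_center(a):
    """A_kl = a_kl - rowmean_k - colmean_l + grandmean."""
    n = len(a)
    row = [sum(a[k]) / n for k in range(n)]
    col = [sum(a[k][l] for k in range(n)) / n for l in range(n)]
    grand = sum(row) / n
    return [[a[k][l] - row[k] - col[l] + grand for l in range(n)]
            for k in range(n)]


def sdcov_v(X, y):
    """V-statistic: (1 / n^2) * sum_{k,l} A_kl * B_kl."""
    n = len(X)
    counts = Counter(y)  # n_r for each observed class r
    A = double_center(pairwise_distances(X))
    total = 0.0
    for k in range(n):
        for l in range(n):
            # B_kl = 1 - I(Y_k = Y_l) / p_hat_r, with p_hat_r = n_r / n
            b = 1.0 - (n / counts[y[k]] if y[k] == y[l] else 0.0)
            total += A[k][l] * b
    return total / n**2


def dvar_v(X):
    """Assumed convention: sqrt of (1 / n^2) * sum_{k,l} A_kl^2."""
    n = len(X)
    A = double_center(pairwise_distances(X))
    return math.sqrt(sum(v * v for row in A for v in row) / n**2)


def sdcor_v(X, y):
    """SDcor_n = SDcov_n / (dvar_n * sqrt(R - 1)), R = number of classes."""
    R = len(set(y))
    return sdcov_v(X, y) / (dvar_v(X) * math.sqrt(R - 1))
```

For the toy data `X = [[0.0], [0.0], [1.0], [1.0]]` with `y = [1, 1, 2, 2]`, where the class determines \(\bm{X}\) exactly, this sketch gives \(\text{SDcov}_n = 0.5\) and \(\text{SDcor}_n = 1\). The double loop costs \(O(n^2)\) time and memory, which is fine for illustration but not tuned for large \(n\).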
If type = "U", the semi-distance covariance statistic is computed as
an ``estimated U-statistic'', which is used in the independence test
statistic and is not necessarily non-negative. Specifically,
$$\widetilde{\text{SDcov}}_n(\bm{X}, y) = \frac{1}{n(n-1)}
\sum_{i \ne j} \| \bm{X}_i - \bm{X}_j \| \left(1 - \sum_{r=1}^{R}
I(Y_i = r) I(Y_j = r) / \tilde{p}_r\right),$$
where \(\tilde{p}_r = (n_r-1) / (n-1) = (n-1)^{-1}(\sum_{i=1}^{n} I(Y_i
= r) - 1)\). Note that the test statistic of the semi-distance independence
test is $$T_n = n \cdot \widetilde{\text{SDcov}}_n(\bm{X}, y).$$
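The estimated U-statistic and the test statistic \(T_n\) can be sketched in the same way. This is again an illustrative plain-Python version, not the package's code, and the names `sdcov_u` and `sd_test_statistic` are hypothetical.

```python
import math
from collections import Counter


def sdcov_u(X, y):
    """Estimated U-statistic:
    (1 / (n (n - 1))) * sum_{i != j} ||X_i - X_j|| * (1 - I(Y_i = Y_j) / p_tilde_r),
    with p_tilde_r = (n_r - 1) / (n - 1).
    """
    n = len(X)
    counts = Counter(y)  # n_r for each observed class r
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # the sum runs over i != j only
            d = math.dist(X[i], X[j])
            if y[i] == y[j]:
                # Two distinct observations share class r, so n_r >= 2
                # and p_tilde_r is strictly positive.
                p_tilde = (counts[y[i]] - 1) / (n - 1)
                weight = 1.0 - 1.0 / p_tilde
            else:
                weight = 1.0
            total += d * weight
    return total / (n * (n - 1))


def sd_test_statistic(X, y):
    """T_n = n * (estimated U-statistic)."""
    return len(X) * sdcov_u(X, y)
```

With `y = [1, 1, 2, 2]`, the data `X = [[0.0], [0.0], [1.0], [1.0]]` give a value of \(2/3\), hence \(T_n = 8/3\). The data `X = [[0.0], [1.0], [0.0], [1.0]]`, in which the class carries no information about \(\bm{X}\), give \(-1/3\), which illustrates that this estimate can be negative.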