Calculates estimators, together with the corresponding \((1-\alpha)\)-confidence intervals, for the four variants of the multivariate coefficient of variation (MCV) and their reciprocals proposed by Reyment (1960), Van Valen (1974), Voinov and Nikulin (1996) and Albert and Zhang (2010).
e_mcv(x, conf_level = 0.95)
When \(d>1\) (respectively \(d=1\)), a data frame with four rows (one row) corresponding to the four MCVs (the univariate CV)
and six columns: the estimators C_est for the MCV (CV), the estimators B_est for their reciprocals, and the lower and upper bounds of the corresponding
confidence intervals [C_lwr, C_upr] and [B_lwr, B_upr].
x: a data matrix of size \(n\times d\).
conf_level: the confidence level. By default, it is equal to 0.95.
The function e_mcv() calculates four different variants of the multivariate coefficient of variation for \(d\)-dimensional data. These variants were introduced
by Reyment (1960, RR), Van Valen (1974, VV), Voinov and Nikulin (1996, VN) and Albert and Zhang (2010, AZ):
$$
{\widehat C}^{RR}=\sqrt{\frac{(\det\mathbf{\widehat\Sigma})^{1/d}}{\boldsymbol{\widehat\mu}^{\top}\boldsymbol{\widehat\mu}}},\
{\widehat C}^{VV}=\sqrt{\frac{\mathrm{tr}\mathbf{\widehat\Sigma}}{\boldsymbol{\widehat\mu}^{\top}\boldsymbol{\widehat\mu}}},\
{\widehat C}^{VN}=\sqrt{\frac{1}{\boldsymbol{\widehat\mu}^{\top}\mathbf{\widehat\Sigma}^{-1}\boldsymbol{\widehat\mu}}},\
{\widehat C}^{AZ}=\sqrt{\frac{\boldsymbol{\widehat\mu}^{\top}\mathbf{\widehat\Sigma}\boldsymbol{\widehat\mu}}{(\boldsymbol{\widehat\mu}^{\top}\boldsymbol{\widehat\mu})^2}},
$$
where \(n\) is the sample size, \(\boldsymbol{\widehat\mu}\) is the empirical mean vector and \(\mathbf{\widehat \Sigma}\) is the empirical covariance matrix:
$$
\boldsymbol{\widehat\mu} = \frac{1}{n}\sum_{j=1}^{n} \mathbf{X}_{j},\; \mathbf{\widehat \Sigma} =\frac{1}{n}\sum_{j=1}^{n} (\mathbf{X}_{j} - \boldsymbol{\widehat \mu})(\mathbf{X}_{j} - \boldsymbol{\widehat \mu})^{\top}.
$$
In the univariate case (\(d=1\)), all four variants reduce to the usual coefficient of variation. The function also computes their reciprocals, the so-called standardized means:
$$
{\widehat B}^{RR}=\sqrt{\frac{\boldsymbol{\widehat\mu}^{\top}\boldsymbol{\widehat\mu}}{(\det\mathbf{\widehat\Sigma})^{1/d}}},\
{\widehat B}^{VV}=\sqrt{\frac{\boldsymbol{\widehat\mu}^{\top}\boldsymbol{\widehat\mu}}{\mathrm{tr}\mathbf{\widehat\Sigma}}},\
{\widehat B}^{VN}=\sqrt{\boldsymbol{\widehat\mu}^{\top}\mathbf{\widehat\Sigma}^{-1}\boldsymbol{\widehat\mu}},\
{\widehat B}^{AZ}=\sqrt{\frac{(\boldsymbol{\widehat\mu}^{\top}\boldsymbol{\widehat\mu})^2}{\boldsymbol{\widehat\mu}^{\top}\mathbf{\widehat\Sigma}\boldsymbol{\widehat\mu}}}.
$$
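The formulas above can be evaluated directly in base R. The following is a minimal sketch, not the package's implementation: the helper name mcv_by_hand() is hypothetical, it returns point estimates only (no confidence intervals), and it rescales cov() so that the covariance uses the divisor \(n\) as in the definition above.

```r
# Hypothetical helper (not part of the package): plug-in estimators of the
# four MCV variants and their reciprocals, computed from the formulas above.
mcv_by_hand <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  d <- ncol(x)
  mu <- colMeans(x)                    # empirical mean vector
  sigma <- cov(x) * (n - 1) / n        # empirical covariance with divisor n
  mu2 <- sum(mu^2)                     # mu' mu
  C <- c(
    RR = sqrt(det(sigma)^(1 / d) / mu2),
    VV = sqrt(sum(diag(sigma)) / mu2),
    VN = sqrt(1 / sum(mu * solve(sigma, mu))),      # 1 / (mu' Sigma^{-1} mu)
    AZ = sqrt(sum(mu * (sigma %*% mu)) / mu2^2)     # mu' Sigma mu / (mu' mu)^2
  )
  data.frame(C_est = C, B_est = 1 / C)  # B is the reciprocal of C
}

# d > 1: four distinct estimates
mcv_by_hand(iris[iris$Species == "setosa", 1:3])

# d = 1: all four rows coincide with the univariate CV
mcv_by_hand(iris[iris$Species == "setosa", 1])
```

With a single column, \(\mathbf{\widehat\Sigma}\) is the scalar \(\widehat\sigma^2\), so each of the four expressions simplifies to \(\widehat\sigma/|\widehat\mu|\), which is why the rows agree in the second call.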
In addition to the estimators, the e_mcv() function calculates the respective confidence intervals [C_lwr, C_upr] and [B_lwr, B_upr] for a given confidence level \(1-\alpha\).
These confidence intervals are based on an asymptotic approximation by a normal distribution, see Ditzhaus and Smaga (2023) for the technical details. These approximations
do not rely on any specific (semi-)parametric assumption on the distribution and are valid nonparametrically, even for tied data.
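As a hedged illustration of what an interval based on a normal approximation generally looks like: if \(\widehat\sigma_C\) denotes an estimator of the asymptotic standard deviation of \(\sqrt{n}(\widehat C - C)\) and \(z_{1-\alpha/2}\) the \((1-\alpha/2)\)-quantile of the standard normal distribution (both symbols are notation assumed here, not taken from the package), the interval has the generic form
$$
\left[\widehat C - z_{1-\alpha/2}\,\frac{\widehat\sigma_C}{\sqrt{n}},\ \widehat C + z_{1-\alpha/2}\,\frac{\widehat\sigma_C}{\sqrt{n}}\right].
$$
The variance estimators actually used for each variant, and any adjustments applied by e_mcv(), are described in Ditzhaus and Smaga (2023).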
Albert A., Zhang L. (2010) A novel definition of the multivariate coefficient of variation. Biometrical Journal 52:667-675.
Ditzhaus M., Smaga L. (2023) Inference for all variants of the multivariate coefficient of variation in factorial designs. Preprint https://arxiv.org/abs/2301.12009.
Reyment R.A. (1960) Studies on Nigerian Upper Cretaceous and Lower Tertiary Ostracoda: part 1. Senonian and Maastrichtian Ostracoda, Stockholm Contributions in Geology, vol 7.
Van Valen L. (1974) Multivariate structural statistics in natural history. Journal of Theoretical Biology 45:235-247.
Voinov V., Nikulin M. (1996) Unbiased Estimators and Their Applications, Vol. 2, Multivariate Case. Kluwer, Dordrecht.
# d > 1 (MCVs)
data_set <- lapply(list(iris[iris$Species == "setosa", 1:3],
                        iris[iris$Species == "versicolor", 1:3],
                        iris[iris$Species == "virginica", 1:3]),
                   as.matrix)
lapply(data_set, e_mcv)
# d = 1 (CV)
data_set <- lapply(list(iris[iris$Species == "setosa", 1],
                        iris[iris$Species == "versicolor", 1],
                        iris[iris$Species == "virginica", 1]),
                   as.matrix)
lapply(data_set, e_mcv)