test_rotasym: Tests of rotational symmetry for hyperspherical data

Description

Tests for assessing the rotational symmetry of a unit-norm random vector $\mathbf{X}$ in $S^{p-1}:=\{\mathbf{x}\in R^p:||\mathbf{x}||=1\}$, $p \ge 2$, about a location $\boldsymbol{\theta}\in S^{p-1}$, from a hyperspherical sample $\mathbf{X}_1,\ldots,\mathbf{X}_n\in S^{p-1}$.

The vector $\mathbf{X}$ is said to be rotationally symmetric about $\boldsymbol{\theta}$ if the distributions of $\mathbf{OX}$ and $\mathbf{X}$ coincide, where $\mathbf{O}$ is any $p\times p$ rotation matrix that fixes $\boldsymbol{\theta}$, i.e., $\mathbf{O}\boldsymbol{\theta}=\boldsymbol{\theta}$.
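To make the definition concrete, the following base-R sketch builds one rotation matrix that fixes a given axis. The construction (identity along the axis, a random rotation on its orthogonal complement) and the names Gamma, R, and O are illustrative assumptions, not part of the package.

```r
# Build a p x p rotation O such that O %*% theta = theta (illustrative only)
set.seed(1)
p <- 4
theta <- c(1, rep(0, p - 1))

# Orthonormal basis of the orthogonal complement of theta, taken from the QR
# decomposition of a full-rank matrix whose first column is theta
Q <- qr.Q(qr(cbind(theta, matrix(rnorm(p * (p - 1)), nrow = p))))
Gamma <- Q[, -1, drop = FALSE]  # p x (p - 1), columns orthogonal to theta

# Random (p - 1) x (p - 1) rotation acting on the complement
R <- qr.Q(qr(matrix(rnorm((p - 1)^2), nrow = p - 1)))
if (det(R) < 0) R[, 1] <- -R[, 1]  # flip one column to force det(R) = +1

# O is the identity along theta and acts as R on the complement
O <- tcrossprod(theta) + Gamma %*% R %*% t(Gamma)

# Checks: O fixes theta, is orthogonal, and has determinant +1
max(abs(O %*% theta - theta))
max(abs(crossprod(O) - diag(p)))
det(O)
```

Rotational symmetry about theta requires that the distribution of the data is unchanged under every such O, not only under the single matrix built here.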

Usage

test_rotasym(data, theta = spherical_mean, type = c("sc", "loc",
  "loc_vMF", "hyb", "hyb_vMF")[5], Fisher = FALSE, U = NULL,
  V = NULL)

Arguments

data

hyperspherical data, a matrix of size c(n, p) with unit norm rows. Normalized internally if any row does not have unit norm (with a warning message). NAs are ignored.

theta

either a unit norm vector of size p giving the axis of rotational symmetry (for the specified-$\boldsymbol{\theta}$ case) or a function that implements an estimator $\hat{\boldsymbol{\theta}}$ of $\boldsymbol{\theta}$ (for the unspecified-$\boldsymbol{\theta}$ case). The default calls the spherical_mean function. See examples.

type

a character string (case insensitive) indicating the type of test to conduct:

"sc": "scatter" test based on the statistic $Q_{\boldsymbol{\theta}}^{\mathrm{sc}}$. Evaluates if the covariance matrix of the multivariate signs is isotropic.
"loc": "location" test based on the statistic $Q_{\boldsymbol{\theta}}^{\mathrm{loc}}$. Evaluates if the expectation of the multivariate signs is zero.
"loc_vMF": adapted "location" test, based on the statistic $Q_{\mathrm{vMF}}^{\mathrm{loc}}$.
"hyb": "hybrid" test based on the statistics $Q_{\boldsymbol{\theta}}^{\mathrm{sc}}$ and $Q_{\boldsymbol{\theta}}^{\mathrm{loc}}$.
"hyb_vMF" (default): adapted "hybrid" test based on the statistics $Q_{\boldsymbol{\theta}}^{\mathrm{sc}}$ and $Q_{\mathrm{vMF}}^{\mathrm{loc}}$.

See the details below for further explanations of the tests.

Fisher

if TRUE, then Fisher's method is employed to aggregate the scatter and location tests in the hybrid test (see details below). Otherwise, the hybrid statistic is the sum of the scatter and location statistics. Defaults to FALSE.

U

multivariate signs of data, a matrix of size c(n, p - 1). Computed if NULL (the default).

V

cosines of data, a vector of size n. Computed if NULL (the default).
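The cosines and multivariate signs can be illustrated with a base-R sketch of the tangent-normal decomposition about theta. It assumes the usual definitions $V_i=\mathbf{X}_i'\boldsymbol{\theta}$ and $\mathbf{U}_i=\boldsymbol{\Gamma}'\mathbf{X}_i/\|\boldsymbol{\Gamma}'\mathbf{X}_i\|$, where $\boldsymbol{\Gamma}$ is a $p\times(p-1)$ matrix whose orthonormal columns span the orthogonal complement of $\boldsymbol{\theta}$. The particular Gamma below (from a QR decomposition) is an assumption; the package may use a different basis, which would change U only by an orthogonal transformation.

```r
# Tangent-normal decomposition of a toy sample about theta (illustrative only)
set.seed(42)
n <- 5
p <- 3
theta <- c(0, 0, 1)

# Toy unit-norm sample: Gaussian rows projected onto the sphere
X <- matrix(rnorm(n * p), nrow = n)
X <- X / sqrt(rowSums(X^2))

# p x (p - 1) matrix with orthonormal columns orthogonal to theta
Gamma <- qr.Q(qr(cbind(theta, matrix(rnorm(p * (p - 1)), nrow = p))))
Gamma <- Gamma[, -1, drop = FALSE]

# Cosines: projections of the data onto theta (vector of size n)
V <- drop(X %*% theta)

# Multivariate signs: normalized tangent coordinates (matrix of size c(n, p - 1))
XG <- X %*% Gamma
U <- XG / sqrt(rowSums(XG^2))

# Check: X is recovered as V * theta + sqrt(1 - V^2) * Gamma %*% U, row by row
X_rec <- outer(V, theta) + sqrt(1 - V^2) * (U %*% t(Gamma))
max(abs(X - X_rec))
```

The shapes of V and U match those documented for the V and U arguments, so precomputed values of this kind are what those arguments are meant to receive.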

Value

An object of the htest class with the following elements:

statistic: test statistic.
parameter: degrees of freedom of the chi-square distribution appearing in all the null asymptotic distributions.
p.value: $p$-value of the test.
method: information on the type of test performed.
data.name: name of the value of data.
U: multivariate signs of data.
V: cosines of data.

Details

Descriptions of the tests:

The "scatter" test is locally and asymptotically optimal against tangent elliptical alternatives to rotational symmetry. However, it is not consistent against tangent von Mises–Fisher (vMF) alternatives. The asymptotic null distribution of $Q_{\boldsymbol{\theta}}^{\mathrm{sc}}$ is unaffected if $\boldsymbol{\theta}$ is estimated, that is, the asymptotic null distributions of $Q_{\boldsymbol{\theta}}^{\mathrm{sc}}$ and $Q_{\hat{\boldsymbol{\theta}}}^{\mathrm{sc}}$ are the same.
The "location" test is locally and asymptotically most powerful against vMF alternatives to rotational symmetry. However, it is not consistent against tangent elliptical alternatives. The asymptotic null distribution of $Q_{\boldsymbol{\theta}}^{\mathrm{loc}}$ for known $\boldsymbol{\theta}$ (the one implemented in test_rotasym) does change if $\boldsymbol{\theta}$ is estimated by $\hat{\boldsymbol{\theta}}$. Therefore, if the test is performed with an estimated $\boldsymbol{\theta}$ (that is, if theta is a function), $Q_{\hat{\boldsymbol{\theta}}}^{\mathrm{loc}}$ will not be properly calibrated. test_rotasym gives a warning in that case.
The "vMF location" test is a modification of the "location" test designed to make its null asymptotic distribution invariant to the estimation of $\boldsymbol{\theta}$ (as is the case for the "scatter" test). The test is optimal against tangent vMF alternatives with a specific, vMF-based, angular function $g_{\mathrm{vMF}}$. Despite not being optimal against all tangent vMF alternatives, it is consistent against all of them. Like the "location" test, it is not consistent against tangent elliptical alternatives.
The "hybrid" test combines the "scatter" and "location" tests (see below for how). The test is optimal against neither tangent elliptical nor tangent vMF alternatives, but it is consistent against both. Since it is based on the "location" test, the test statistic will not be properly calibrated if it is computed with an estimator $\hat{\boldsymbol{\theta}}$. test_rotasym gives a warning in that case.
The "vMF hybrid" test is the analogue of the "hybrid" test, with the "location" test replaced by the "vMF location" test.

The combination of the scatter and location tests in the hybrid tests is done in two different ways:

If Fisher = FALSE, then the hybrid test statistic is the sum of the scatter and location test statistics: $$Q_{\boldsymbol{\theta}}^{\mathrm{hyb}}:=Q_{\boldsymbol{\theta}}^{\mathrm{sc}}+ Q_{\boldsymbol{\theta}}^{\mathrm{loc}}.$$
If Fisher = TRUE, then Fisher's method for aggregating independent tests (the two test statistics are independent under rotational symmetry) is considered, resulting in the hybrid test statistic $$Q_{\boldsymbol{\theta}}^{\mathrm{hyb}} :=-2(\log(p_{\mathrm{sc}})+\log(p_{\mathrm{loc}})),$$ where $p_{\mathrm{sc}}$ and $p_{\mathrm{loc}}$ are the $p$-values of the scatter and location tests, respectively.

The hybrid test statistic $Q_{\mathrm{vMF}}^{\mathrm{hyb}}$ follows analogously to $Q_{\boldsymbol{\theta}}^{\mathrm{hyb}}$ by replacing $Q_{\boldsymbol{\theta}}^{\mathrm{loc}}$ with $Q_{\mathrm{vMF}}^{\mathrm{loc}}$.
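The Fisher aggregation can be sketched in base R as follows. The two $p$-values are made-up illustrative numbers, not output of test_rotasym, and the snippet only reproduces the formula above together with the standard reference distribution of Fisher's method; it is not a copy of the package internals.

```r
# Hypothetical p-values of the scatter and location tests (illustrative values)
p_sc <- 0.20
p_loc <- 0.03

# Fisher's statistic for the two p-values
Q_fisher <- -2 * (log(p_sc) + log(p_loc))

# For k independent tests with uniform null p-values, -2 * sum(log(p_i)) is
# chi-square distributed with 2k degrees of freedom; here k = 2, so df = 4
p_hyb <- pchisq(Q_fisher, df = 4, lower.tail = FALSE)

# Closed form of the chi-square(4) upper tail, as a cross-check
p_closed <- p_sc * p_loc * (1 - log(p_sc * p_loc))
c(Q_fisher = Q_fisher, p_hyb = p_hyb, p_closed = p_closed)
```

With Fisher = FALSE the statistics themselves are added instead, so the two options generally give different hybrid $p$-values for the same data.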

Finally, recall that the tests are designed to test implications of rotational symmetry. Therefore, the tests are not consistent against all types of alternatives to rotational symmetry.

References

García-Portugués, E., Paindaveine, D., Verdebout, T. (2019) On optimal tests for rotational symmetry against new classes of hyperspherical distributions. arXiv:1706.05030. https://arxiv.org/abs/1706.05030

Examples


# NOT RUN {
## Rotational symmetry holds

# Sample data from a vMF (rotationally symmetric distribution about mu)
n <- 200
p <- 10
theta <- c(1, rep(0, p - 1))
set.seed(123456789)
data_0 <- r_vMF(n = n, mu = theta, kappa = 1)

# theta known
test_rotasym(data = data_0, theta = theta, type = "sc")
test_rotasym(data = data_0, theta = theta, type = "loc")
test_rotasym(data = data_0, theta = theta, type = "loc_vMF")
test_rotasym(data = data_0, theta = theta, type = "hyb")
test_rotasym(data = data_0, theta = theta, type = "hyb", Fisher = TRUE)
test_rotasym(data = data_0, theta = theta, type = "hyb_vMF")
test_rotasym(data = data_0, theta = theta, type = "hyb_vMF", Fisher = TRUE)

# theta unknown (employs the spherical mean as estimator)
test_rotasym(data = data_0, type = "sc")
test_rotasym(data = data_0, type = "loc") # Warning
test_rotasym(data = data_0, type = "loc_vMF")
test_rotasym(data = data_0, type = "hyb") # Warning
test_rotasym(data = data_0, type = "hyb", Fisher = TRUE) # Warning
test_rotasym(data = data_0, type = "hyb_vMF")
test_rotasym(data = data_0, type = "hyb_vMF", Fisher = TRUE)

## Rotational symmetry does not hold

# Sample non-rotationally symmetric data from a tangent-vMF distribution
# The scatter test is blind to these deviations, while the location tests
# are optimal
n <- 200
p <- 10
theta <- c(1, rep(0, p - 1))
mu <- c(rep(0, p - 2), 1)
kappa <- 2
set.seed(123456789)
r_V <- function(n) {
  r_g_vMF(n = n, p = p, kappa = 1)
}
data_1 <- r_TM(n = n, r_V = r_V, theta = theta, mu = mu, kappa = kappa)

# theta known
test_rotasym(data = data_1, theta = theta, type = "sc")
test_rotasym(data = data_1, theta = theta, type = "loc")
test_rotasym(data = data_1, theta = theta, type = "loc_vMF")
test_rotasym(data = data_1, theta = theta, type = "hyb")
test_rotasym(data = data_1, theta = theta, type = "hyb", Fisher = TRUE)
test_rotasym(data = data_1, theta = theta, type = "hyb_vMF")
test_rotasym(data = data_1, theta = theta, type = "hyb_vMF", Fisher = TRUE)

# theta unknown (employs the spherical mean as estimator)
test_rotasym(data = data_1, type = "sc")
test_rotasym(data = data_1, type = "loc") # Warning
test_rotasym(data = data_1, type = "loc_vMF")
test_rotasym(data = data_1, type = "hyb") # Warning
test_rotasym(data = data_1, type = "hyb", Fisher = TRUE) # Warning
test_rotasym(data = data_1, type = "hyb_vMF")
test_rotasym(data = data_1, type = "hyb_vMF", Fisher = TRUE)

# Sample non-rotationally symmetric data from a tangent-elliptical distribution
# The location tests are blind to these deviations, while the
# scatter test is optimal
n <- 200
p <- 10
theta <- c(1, rep(0, p - 1))
Lambda <- matrix(0.5, nrow = p - 1, ncol = p - 1)
diag(Lambda) <- 1
set.seed(123456789)
r_V <- function(n) {
  r_g_vMF(n = n, p = p, kappa = 1)
}
data_2 <- r_TE(n = n, r_V = r_V, theta = theta, Lambda = Lambda)

# theta known
test_rotasym(data = data_2, theta = theta, type = "sc")
test_rotasym(data = data_2, theta = theta, type = "loc")
test_rotasym(data = data_2, theta = theta, type = "loc_vMF")
test_rotasym(data = data_2, theta = theta, type = "hyb")
test_rotasym(data = data_2, theta = theta, type = "hyb", Fisher = TRUE)
test_rotasym(data = data_2, theta = theta, type = "hyb_vMF")
test_rotasym(data = data_2, theta = theta, type = "hyb_vMF", Fisher = TRUE)

# theta unknown (employs the spherical mean as estimator)
test_rotasym(data = data_2, type = "sc")
test_rotasym(data = data_2, type = "loc") # Warning
test_rotasym(data = data_2, type = "loc_vMF")
test_rotasym(data = data_2, type = "hyb") # Warning
test_rotasym(data = data_2, type = "hyb", Fisher = TRUE) # Warning
test_rotasym(data = data_2, type = "hyb_vMF")
test_rotasym(data = data_2, type = "hyb_vMF", Fisher = TRUE)

## Sunspots births data

# Load data
data("sunspots_births")
sunspots_births$X <-
  cbind(cos(sunspots_births$phi) * cos(sunspots_births$theta),
        cos(sunspots_births$phi) * sin(sunspots_births$theta),
        sin(sunspots_births$phi))

# Test rotational symmetry for the 23rd cycle, specified theta
sunspots_23 <- subset(sunspots_births, cycle == 23)
test_rotasym(data = sunspots_23$X, type = "sc", theta = c(0, 0, 1))
test_rotasym(data = sunspots_23$X, type = "loc", theta = c(0, 0, 1))
test_rotasym(data = sunspots_23$X, type = "hyb", theta = c(0, 0, 1))

# Test rotational symmetry for the 23rd cycle, unspecified theta
spherical_loc_PCA(sunspots_23$X)
test_rotasym(data = sunspots_23$X, type = "sc", theta = spherical_loc_PCA)
test_rotasym(data = sunspots_23$X, type = "loc_vMF",
             theta = spherical_loc_PCA)
test_rotasym(data = sunspots_23$X, type = "hyb_vMF",
             theta = spherical_loc_PCA)

# Test rotational symmetry for the 22nd cycle, specified theta
sunspots_22 <- subset(sunspots_births, cycle == 22)
test_rotasym(data = sunspots_22$X, type = "sc", theta = c(0, 0, 1))
test_rotasym(data = sunspots_22$X, type = "loc", theta = c(0, 0, 1))
test_rotasym(data = sunspots_22$X, type = "hyb", theta = c(0, 0, 1))

# Test rotational symmetry for the 22nd cycle, unspecified theta
spherical_loc_PCA(sunspots_22$X)
test_rotasym(data = sunspots_22$X, type = "sc", theta = spherical_loc_PCA)
test_rotasym(data = sunspots_22$X, type = "loc_vMF",
             theta = spherical_loc_PCA)
test_rotasym(data = sunspots_22$X, type = "hyb_vMF",
             theta = spherical_loc_PCA)
# }
