TechPhD (version 1.0.0)

test_covmat: Test of high-dimensional covariance matrices with longitudinal/functional data

Description

This function implements the test procedures proposed by Santo and Zhong (2020) and by Zhong, Li, and Santo (2019) for testing the homogeneity of covariance matrices in high-dimensional longitudinal/functional data. Temporal and spatial dependence are allowed. The null hypothesis is that the covariance matrices are homogeneous over time, that is, the covariance matrices at all time points are equal.

Usage

test_covmat(y, n, p, TT, alpha = 0.01, threads = 1)

Arguments

y

A high-dimensional longitudinal data set stored as a three-dimensional array, where the first coordinate indexes features, the second indexes sample subjects, and the third indexes time repetitions. Thus, the dimension of y is \(p \times n \times TT\), where \(p\) is the dimension of the feature variables (data dimension), \(n\) is the number of individuals (sample size), and \(TT\) is the number of repetition times.

n

The number of individuals (sample size).

p

The dimension of feature variables (data dimension).

TT

The number of repetition times.

alpha

The nominal type I error rate (significance level) of the homogeneity test. Suggested values for alpha include 0.01 (default) and 0.05.

threads

The number of threads used for computing. The default value is 1. Increase the number of threads to enable parallel computing.
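The array layout expected for y can be sketched as follows. This is a minimal illustration that assumes the raw data arrive as a list of per-time matrices with subjects in rows; the object `per_time` and the small dimensions are hypothetical and chosen only to show the reshaping.

```r
# Hypothetical input: a list of TT matrices, each n x p (subjects in rows).
set.seed(1)
n <- 4; p <- 3; TT <- 2
per_time <- lapply(seq_len(TT), function(k) matrix(rnorm(n * p), nrow = n, ncol = p))

# Transpose each matrix to p x n (features in rows), then stack along time.
y <- array(NA_real_, dim = c(p, n, TT))
for (k in seq_len(TT)) {
  y[, , k] <- t(per_time[[k]])
}

dim(y)  # features, subjects, time points
```

With this layout, `y[, i, k]` holds the p feature values of subject i at time k, and the values passed as n, p, and TT should match `dim(y)`.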

Value

The function returns a list containing the test result, the estimated change point, the test statistic, the p-value, and a correlation matrix.

$reject

Null hypothesis rejection indicator. A value of 1 indicates the null hypothesis is rejected. The null hypothesis is that all the covariance matrices are equal across time.

$estcp

The first estimated change point, reported when the null hypothesis is rejected. This value is 0 if the null hypothesis is not rejected.

$teststat

The test statistic.

$pvalue

The p-value.

$corrmat

The correlation matrix among the \(TT-1\) standardized statistics \(\hat{D}_{nt}\), \(t=1,\ldots,TT-1\). The test statistic is the maximum of these statistics, each of which quantifies the difference, in Frobenius norm, between the covariance matrices before and after time \(t\). For further details, see Zhong, Li, and Santo (2019).
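The returned list can be inspected with ordinary `$` access. The sketch below uses a hypothetical helper, `describe_covmat_result()`, which is not part of TechPhD, and a placeholder list whose numbers are made up purely to mimic the documented fields.

```r
# Hypothetical helper (not part of TechPhD): summarise the returned list as text.
describe_covmat_result <- function(res) {
  if (res$reject == 1) {
    sprintf("Reject H0: first estimated change point at time %d (p-value %.3g)",
            as.integer(res$estcp), res$pvalue)
  } else {
    sprintf("Do not reject H0 (p-value %.3g)", res$pvalue)
  }
}

# Placeholder object with the documented field names; values are illustrative only.
mock <- list(reject = 1, estcp = 2, teststat = 5, pvalue = 0.001, corrmat = diag(4))
describe_covmat_result(mock)
```

In practice the argument would be the object returned by `test_covmat()` rather than a hand-built list.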

Details

The procedure tests the homogeneity of covariance matrices in high-dimensional longitudinal/functional data. The method allows the data dimension to be much larger than both the sample size and the number of repeated measurements. It can also accommodate general spatial and temporal dependence. For details of the proposed procedures, see Zhong, Li, and Santo (2019) and Santo and Zhong (2020).
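To give some intuition for comparing covariance matrices before and after a candidate time point, here is a naive plug-in sketch. It is not the standardized statistic used by the package or the cited papers: it assumes independent observations, a low dimension, and a large sample, and the helper `pooled_cov()` and all settings are hypothetical.

```r
# Toy data: identity covariance up to time k0, then 4 * identity afterwards.
set.seed(42)
p <- 5; n <- 200; TT <- 4; k0 <- 2
y <- array(rnorm(p * n * TT), dim = c(p, n, TT))
y[, , (k0 + 1):TT] <- 2 * y[, , (k0 + 1):TT]

# Hypothetical helper: average the per-time sample covariances over `times`.
pooled_cov <- function(y, times) {
  covs <- lapply(times, function(k) cov(t(y[, , k])))  # each is p x p
  Reduce(`+`, covs) / length(times)
}

# Squared Frobenius norm of the before/after difference for each split point.
naive_dist <- sapply(seq_len(TT - 1), function(s) {
  d <- pooled_cov(y, seq_len(s)) - pooled_cov(y, (s + 1):TT)
  sum(d^2)
})

which.max(naive_dist)  # split point with the largest before/after difference
```

The naive distance is largest at the split that separates the two covariance regimes cleanly. The procedures in the references replace this plug-in quantity with statistics designed for dimensions far exceeding the sample size and for dependent data.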

References

Zhong, Li, and Santo (2019). Homogeneity tests of covariance matrices with high-dimensional longitudinal data. Biometrika, 106, 619-634.

Santo and Zhong (2020). Homogeneity tests of covariance and change-points identification for high-dimensional functional data. arXiv:2005.01895.

Examples

# NOT RUN {
# A testing example with a change point at time 2

# Set parameters
p <- 30; n <- 10; TT <- 5
delta <- 0.35
m <- p+20; L <- 3; k0 <- 2; w <- 0.2

# Generate data
Gamma1 <- Gamma2 <- matrix(0, p, m * L)
y <- array(0, c(p, n, TT))
set.seed(928)

for (i in 1:p){
  for (j in 1:p){
    dij <- abs(i - j)

    if (dij < (p * w)){
      Gamma1[i, j] <- (dij + 1) ^ (-2)
      Gamma2[i, j] <- (dij + 1 + delta) ^ (-2)
    }
  }
}

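# White-noise innovations: m * (TT + L - 1) rows, one column per subject
Z <- matrix(rnorm(m * (TT + L - 1) * n), m * (TT + L - 1), n)

# Each time point mixes L consecutive blocks of m innovations, so adjacent
# time points share innovations and are temporally dependent.
# Times 1..k0 use Gamma1; times (k0 + 1)..TT use Gamma2 (change after k0).
for (t in 1:k0){
  y[, , t] <- Gamma1 %*% Z[((t - 1) * m + 1):((t + L - 1) * m), ]
}
for (t in (k0+1):TT){
  y[, , t] <- Gamma2 %*% Z[((t - 1) * m + 1):((t + L - 1) * m), ]
}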

test_covmat(y, n, p, TT, alpha = 0.01)
# }