This function implements the two-sample \(l_2\)-norm-based high-dimensional covariance test proposed by Li and Chen (2012). Suppose \(\{\mathbf{X}_1, \ldots, \mathbf{X}_{n_1}\}\) are i.i.d. copies of a \(p\)-dimensional random vector \(\mathbf{X}\) with covariance matrix \(\mathbf{\Sigma}_1\), and \(\{\mathbf{Y}_1, \ldots, \mathbf{Y}_{n_2}\}\) are i.i.d. copies of \(\mathbf{Y}\) with covariance matrix \(\mathbf{\Sigma}_2\). The test statistic \(T_{LC}\) is defined as $$T_{LC} = A_{n_1}+B_{n_2}-2C_{n_1,n_2},$$ where \(A_{n_1}\), \(B_{n_2}\), and \(C_{n_1,n_2}\) are unbiased estimators for \(\mathrm{tr}(\mathbf{\Sigma}^2_1)\), \(\mathrm{tr}(\mathbf{\Sigma}^2_2)\), and \(\mathrm{tr}(\mathbf{\Sigma}_1\mathbf{\Sigma}_2)\), respectively. Under the null hypothesis \(H_{0c}: \mathbf{\Sigma}_1 = \mathbf{\Sigma}_2\), with \(\mathbf{\Sigma}\) denoting the common covariance matrix, the leading variance of \(T_{LC}\) is \(\sigma^2_{T_{LC}} = 4\left(\frac{1}{n_1}+\frac{1}{n_2}\right)^2 \mathrm{tr}^2(\mathbf{\Sigma}^2)\), which can be consistently estimated by \(\hat\sigma^2_{T_{LC}}\). The explicit formulas of \(A_{n_1}\), \(B_{n_2}\), \(C_{n_1,n_2}\) and \(\hat\sigma^2_{T_{LC}}\) can be found in Equations (2.1), (2.2) and Theorem 1 of Li and Chen (2012). Under some regularity conditions and \(H_{0c}\), the standardized statistic \(T_{LC}/\hat\sigma_{T_{LC}}\) converges in distribution to a standard normal distribution as \(n_1, n_2, p \rightarrow \infty\). The asymptotic \(p\)-value is obtained by $$p_{LC} = 1-\Phi(T_{LC}/\hat\sigma_{T_{LC}}),$$ where \(\Phi(\cdot)\) is the cdf of the standard normal distribution.
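As a rough numerical illustration of the variance and \(p\)-value formulas above, the R sketch below evaluates the leading variance for an assumed common covariance \(\mathbf{\Sigma} = \mathbf{I}_p\) and turns a made-up standardized value into the one-sided asymptotic \(p\)-value. The names `Sigma`, `tr_Sigma2`, `sigma_TLC`, `z` and `p_LC` are hypothetical and chosen only for this illustration; nothing here calls or reproduces the internals of `covtest.lc`.

```r
# Illustration only: Sigma = I_p is an assumed common covariance under H0
n1 <- 100; n2 <- 100; p <- 500
Sigma <- diag(p)

# For a symmetric matrix, tr(Sigma^2) equals the sum of its squared entries
tr_Sigma2 <- sum(Sigma^2)

# Leading variance: 4 * (1/n1 + 1/n2)^2 * tr^2(Sigma^2)
sigma2_TLC <- 4 * (1 / n1 + 1 / n2)^2 * tr_Sigma2^2
sigma_TLC <- sqrt(sigma2_TLC)

# Hypothetical standardized statistic T_LC / sigma_hat (not computed from data)
z <- 1.96

# One-sided asymptotic p-value, identical to 1 - pnorm(z)
p_LC <- pnorm(z, lower.tail = FALSE)
```

With these assumed inputs, \(\mathrm{tr}(\mathbf{\Sigma}^2) = p = 500\), so the leading standard deviation works out to 20. Because the \(p\)-value uses only the upper tail, large positive values of the standardized statistic count as evidence against \(H_{0c}\).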
covtest.lc(dataX,dataY)
stat
the value of the test statistic
pval
the p-value for the test.
dataX
an \(n_1\) by \(p\) data matrix
dataY
an \(n_2\) by \(p\) data matrix
Li, J. and Chen, S. X. (2012). Two sample tests for high-dimensional covariance matrices. The Annals of Statistics, 40(2):908–940.
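# Sample sizes and dimension
n1 <- 100; n2 <- 100; pp <- 500
set.seed(1)
# Both samples have identity covariance, so the null hypothesis holds
X <- matrix(rnorm(n1 * pp), nrow = n1, ncol = pp)
Y <- matrix(rnorm(n2 * pp), nrow = n2, ncol = pp)
covtest.lc(X, Y)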
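The example above draws both samples from the same distribution. A complementary sketch, under the assumption that the second sample has covariance \(2\mathbf{I}_p\), shows how data violating \(H_{0c}\) might be generated. The names `Y_alt` and `res` are hypothetical, and the call is guarded because it assumes the package providing `covtest.lc` is already attached and that the return value is a list with the components `stat` and `pval` described above.

```r
set.seed(1)
n1 <- 100; n2 <- 100; pp <- 500

# First sample: identity covariance
X <- matrix(rnorm(n1 * pp), nrow = n1, ncol = pp)

# Second sample: every coordinate has variance 2, so Sigma_2 = 2 * I_p
Y_alt <- matrix(rnorm(n2 * pp, sd = sqrt(2)), nrow = n2, ncol = pp)

# Average per-coordinate sample variance of each sample
mean_var_X <- mean(apply(X, 2, var))
mean_var_Y <- mean(apply(Y_alt, 2, var))

# Run the test only if covtest.lc is available in the session
if (exists("covtest.lc")) {
  res <- covtest.lc(X, Y_alt)
  print(res$stat)  # assumed component name, per the value description
  print(res$pval)  # assumed component name, per the value description
}
```

The two average variances should be close to 1 and 2 respectively, which confirms that the simulated covariances differ before the test is applied.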