meantest.pe.comp: Two-sample PE mean test for high-dimensional data via PE component

Description

This function implements the two-sample PE mean test via the construction of the PE component. Let $M_{CQ}/\hat\sigma_{M_{CQ}}$ denote the $l_2$-norm-based mean test statistic (see meantest.cq for details). The PE component is constructed as $$J_m = \sqrt{p}\sum_{i=1}^p M_i\widehat\nu^{-1/2}_i \mathcal{I}\{ \sqrt{2}M_i\widehat\nu^{-1/2}_i + 1 > \delta_{mean} \}, $$ where $\delta_{mean}$ is the threshold of the screening procedure, with recommended value $\delta_{mean}=2\log(\log (n_1+n_2))\log p$. The explicit forms of $M_{i}$ and $\widehat\nu_{i}$ can be found in Section 3.1 of Yu et al. (2022).

The PE mean test statistic is defined as $$M_{PE}=M_{CQ}/\hat\sigma_{M_{CQ}}+J_m.$$ Under some regularity conditions and the null hypothesis $H_{0m}: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2$, the test statistic $M_{PE}$ converges in distribution to a standard normal distribution as $n_1, n_2, p \rightarrow \infty$.

The asymptotic $p$-value is obtained as $$p\text{-value}= 1-\Phi(M_{PE}),$$ where $\Phi(\cdot)$ is the cdf of the standard normal distribution.
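The two closed-form pieces of the description, the recommended threshold and the asymptotic $p$-value, can be sketched in base R. This is only an illustration of the formulas: the helper names `default_delta_mean` and `pe_pvalue` are hypothetical and are not part of the package, and the value 1.645 is an arbitrary statistic chosen for the example.

```r
# Hypothetical helpers illustrating the formulas in the Description;
# they are NOT functions exported by the package.

# Recommended threshold: delta_mean = 2 * log(log(n1 + n2)) * log(p)
default_delta_mean <- function(n1, n2, p) {
  2 * log(log(n1 + n2)) * log(p)
}

# Asymptotic p-value: 1 - Phi(M_PE), computed as an upper-tail probability
pe_pvalue <- function(m_pe) {
  pnorm(m_pe, lower.tail = FALSE)
}

# Threshold for n1 = n2 = 100 and p = 500 (roughly 20.72)
delta <- default_delta_mean(n1 = 100, n2 = 100, p = 500)

# p-value for an arbitrary illustrative statistic (roughly 0.05)
pval <- pe_pvalue(1.645)
```

Using `lower.tail = FALSE` rather than `1 - pnorm(m_pe)` avoids loss of precision when the statistic is large, which is the typical case once the PE component is nonzero.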

Usage

meantest.pe.comp(dataX,dataY,delta=NULL)

Value

stat: the value of the test statistic
pval: the p-value of the test

Arguments

dataX: an $n_1$ by $p$ data matrix
dataY: an $n_2$ by $p$ data matrix
delta: a scalar; the thresholding value used in the construction of the PE component. If not specified, the function uses a default value $\delta_{mean}=2\log(\log (n_1+n_2))\log p$.

References

Yu, X., Li, D., Xue, L., and Li, R. (2022). Power-enhanced simultaneous test of high-dimensional mean vectors and covariance matrices with application to gene-set testing. Journal of the American Statistical Association, (in press):1–14.

Examples

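library(PEtests)  # assumed: the package that provides meantest.pe.comp
set.seed(1)
n1 <- 100; n2 <- 100; pp <- 500
# Two independent standard-normal samples with equal means, so H0 holds
X <- matrix(rnorm(n1 * pp), nrow = n1, ncol = pp)
Y <- matrix(rnorm(n2 * pp), nrow = n2, ncol = pp)
meantest.pe.comp(X, Y)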
