NRAHDLTP (version 0.1.2)

tsbf_zzz2023: Test proposed by Zhang et al. (2023)

Description

The test of Zhang et al. (2023) for the equality of two high-dimensional mean vectors from two samples, without assuming that the two covariance matrices are equal.

Usage

tsbf_zzz2023(y1, y2, cutoff)

Value

A (list) object of S3 class htest containing the following elements:

p.value

the p-value of the test proposed by Zhang et al. (2023).

statistic

the test statistic proposed by Zhang et al. (2023).

df

the estimated approximate degrees of freedom of the test of Zhang et al. (2023).

cpn

the adjustment coefficient used in the test of Zhang et al. (2023).

Arguments

y1

The data matrix (p by n1) from the first population. Each column represents a \(p\)-dimensional sample.

y2

The data matrix (p by n2) from the second population. Each column represents a \(p\)-dimensional sample.

cutoff

An empirical criterion for applying the adjustment coefficient (returned as cpn). The example below uses cutoff = 1.2.
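Because y1 and y2 are expected with variables in rows and observations in columns, data stored the other way round (one observation per row, as is common in R) would need to be transposed first. A minimal sketch, assuming a hypothetical n by p matrix named x_rows that is not part of the package:

```r
# Hypothetical input: 20 observations in rows, 5 variables in columns (n by p).
set.seed(1)
x_rows <- matrix(rnorm(20 * 5), nrow = 20, ncol = 5)

# Transpose so that each column is one p-dimensional observation (p by n),
# which is the layout described for y1 and y2.
y1 <- t(x_rows)

dim(y1)  # rows = variables (p), columns = observations (n1)
```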

Details

Suppose we have two independent high-dimensional samples: $$ \boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i} \;\text{are i.i.d. with}\; \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma}_i,\; i=1,2. $$

The primary objective is to test $$H_{0}: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2 \;\text{versus}\; H_{1}: \boldsymbol{\mu}_1 \neq \boldsymbol{\mu}_2.$$

Zhang et al. (2023) proposed the following test statistic: $$T_{ZZZ}=\frac{n_1 n_2}{np}(\bar{\boldsymbol{y}}_1-\bar{\boldsymbol{y}}_2)^{\top} \hat{\boldsymbol{D}}_n^{-1}(\bar{\boldsymbol{y}}_1-\bar{\boldsymbol{y}}_2),$$ where \(\bar{\boldsymbol{y}}_{i},\; i=1,2\), are the sample mean vectors, and \(\hat{\boldsymbol{D}}_n=\operatorname{diag}(\hat{\boldsymbol{\Sigma}}_1/n+\hat{\boldsymbol{\Sigma}}_2/n)\) with \(n=n_1+n_2\).

They showed that, under the null hypothesis, \(T_{ZZZ}\) and a chi-squared-type mixture have the same limiting distribution.
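To make the formula for \(T_{ZZZ}\) concrete, the following base R sketch transcribes it literally as displayed above. It is not the package's implementation: the function name tzzz_sketch is hypothetical, \(\hat{\boldsymbol{\Sigma}}_i\) is assumed to be the usual unbiased sample covariance (denominator \(n_i - 1\)), and neither the adjustment coefficient nor the p-value is computed, so the value may differ from the statistic returned by tsbf_zzz2023.

```r
# Hypothetical helper: literal transcription of the T_ZZZ formula shown above.
# y1, y2: p by n1 and p by n2 matrices, one observation per column.
tzzz_sketch <- function(y1, y2) {
  stopifnot(is.matrix(y1), is.matrix(y2), nrow(y1) == nrow(y2))
  n1 <- ncol(y1)
  n2 <- ncol(y2)
  n <- n1 + n2
  p <- nrow(y1)

  # Difference of the sample mean vectors.
  mean_diff <- rowMeans(y1) - rowMeans(y2)

  # Diagonals of the sample covariance matrices are the per-variable
  # sample variances (assumption: unbiased estimator, as computed by var()).
  v1 <- apply(y1, 1, var)
  v2 <- apply(y2, 1, var)

  # Diagonal of D_n exactly as written in the Details section.
  d_n <- v1 / n + v2 / n

  # Since D_n is diagonal, the quadratic form reduces to a weighted sum.
  (n1 * n2 / (n * p)) * sum(mean_diff^2 / d_n)
}

# Small usage example with simulated data under the null hypothesis.
set.seed(42)
tzzz_sketch(matrix(rnorm(5 * 8), 5, 8), matrix(rnorm(5 * 10), 5, 10))
```

Working only with the per-variable variances avoids forming the full p by p covariance matrices, which matters when p is large.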

References

zhang2023twoNRAHDLTP

Examples

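# Simulate two samples with equal means (the null hypothesis holds)
# but different compound-symmetric covariance matrices.
set.seed(1234)
n1 <- 20   # size of the first sample
n2 <- 30   # size of the second sample
p <- 50    # dimension
mu1 <- t(t(rep(0, p)))   # p by 1 zero mean vector
mu2 <- mu1
rho1 <- 0.1   # common correlation, first population
rho2 <- 0.2   # common correlation, second population
a1 <- 1       # common variance, first population
a2 <- 2       # common variance, second population
# Gamma_i has x_i on the diagonal and w_i elsewhere, chosen so that
# Gamma_i %*% t(Gamma_i) = a_i * ((1 - rho_i) * I + rho_i * J).
w1 <- (-2 * sqrt(a1 * (1 - rho1)) + sqrt(4 * a1 * (1 - rho1) + 4 * p * a1 * rho1)) / (2 * p)
x1 <- w1 + sqrt(a1 * (1 - rho1))
Gamma1 <- matrix(rep(w1, p * p), nrow = p)
diag(Gamma1) <- rep(x1, p)
w2 <- (-2 * sqrt(a2 * (1 - rho2)) + sqrt(4 * a2 * (1 - rho2) + 4 * p * a2 * rho2)) / (2 * p)
x2 <- w2 + sqrt(a2 * (1 - rho2))
Gamma2 <- matrix(rep(w2, p * p), nrow = p)
diag(Gamma2) <- rep(x2, p)
# Standard normal noise, one p-dimensional observation per column.
Z1 <- matrix(rnorm(n1*p,mean = 0,sd = 1), p, n1)
Z2 <- matrix(rnorm(n2*p,mean = 0,sd = 1), p, n2)
# Data matrices: p by n1 and p by n2.
y1 <- Gamma1 %*% Z1 + mu1%*%(rep(1,n1))
y2 <- Gamma2 %*% Z2 + mu2%*%(rep(1,n2))
tsbf_zzz2023(y1,y2,cutoff=1.2)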
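The choice of w1 and x1 in the example is easier to follow once the implied covariance is checked numerically. The sketch below rebuilds Gamma1 with the same constants and compares Gamma1 %*% t(Gamma1) with the compound-symmetric matrix a1 * ((1 - rho1) * I + rho1 * J), where J is the all-ones matrix. The names s1, Sigma1 and target are introduced here for illustration only.

```r
# Same constants as in the example above.
p <- 50
a1 <- 1
rho1 <- 0.1

# s1 is the part of the diagonal of Gamma1 that is added on top of w1.
s1 <- sqrt(a1 * (1 - rho1))
w1 <- (-2 * s1 + sqrt(4 * a1 * (1 - rho1) + 4 * p * a1 * rho1)) / (2 * p)

# Gamma1 = s1 * I + w1 * J: w1 everywhere, w1 + s1 on the diagonal.
Gamma1 <- matrix(w1, nrow = p, ncol = p)
diag(Gamma1) <- w1 + s1

# Covariance of Gamma1 %*% z when z has identity covariance.
Sigma1 <- Gamma1 %*% t(Gamma1)

# Compound-symmetric target: variance a1, common correlation rho1.
target <- a1 * ((1 - rho1) * diag(p) + rho1 * matrix(1, p, p))

# Compares the two matrices up to floating point tolerance.
all.equal(Sigma1, target)
```

The agreement follows because w1 is the positive root of p * w^2 + 2 * s1 * w - a1 * rho1 = 0, which is the condition for the off-diagonal entries of Gamma1 %*% t(Gamma1) to equal a1 * rho1.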