TestCor-package: FWER and FDR controlling procedures for multiple correlation tests

Description

The package gathers multiple testing procedures that are proven to control the FWER asymptotically in the framework of correlation testing. Four test statistics can be considered: the empirical correlation, the Student statistic, Fisher's z-transform and the usual Gaussian statistic based on the random variables $(X_i-mean(X_i))(X_j-mean(X_j))$. Four methods are implemented: Bonferroni's (1935), Šidák's (1967), Romano & Wolf's (2005) bootstrap, and Drton & Perlman's (2007) procedure based on the asymptotic distributions of the test statistics, called MaxTinfty. The package also includes multiple testing procedures related to the control of the FDR: Cai & Liu's (2016) procedures, called LCT-N and LCT-B, which have been proven to control the FDR for correlation tests, and Benjamini & Hochberg's (1995) procedure, for which no theoretical result is available in correlation testing.

Details

Consider $\lbrace \mathbf{X}_\ell = (X_{1\ell},\dots,X_{p\ell}),\; \ell=1,\dots,n\rbrace$ a set of $n$ independent and identically distributed $\mathbb{R}^p$-valued random variables. Denote by data the array containing $\lbrace\mathbf{X}_\ell,\; \ell=1,\dots,n\rbrace$, with the observation index $\ell$ in rows. The aim is to test simultaneously $$(H_{0ij})~ Cor(X_i,X_j)=0 \quad\text{against}\quad (H_{1ij})~ Cor(X_i,X_j)\neq 0,\quad i,j=1,\dots,p,~ i<j.$$ Four test statistics are implemented: the empirical correlation, the Student statistic, Fisher's z-transform and the usual test statistic on the expectation of the product of the centred random variables. They are available in function eval_stat. Next, two main types of procedures are available:

Asymptotically FWER controlling procedures:

Bonferroni's (1935) method, Šidák's (1967) procedure, Romano & Wolf's (2005) bootstrap procedure and Drton & Perlman's (2007) procedure. A description of these methods can be found in Chapter 5 of Roux (2018). To apply these procedures, function ApplyFwerCor can be used as follows:

ApplyFwerCor(data,alpha,stat_test,method), with alpha the desired level of control for the FWER, and stat_test and method respectively the kind of test statistic and the FWER controlling method. The function returns the list of indexes $\lbrace (i,j), i < j \rbrace$ for which the null hypothesis $(H_{0ij})$ is rejected.
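To make the single-step thresholds concrete, here is a minimal base R sketch of Bonferroni's and Šidák's corrections applied to Fisher z-transformed correlations. It does not call ApplyFwerCor: the simulated matrix `x`, the normalisation `sqrt(n - 3) * atanh(r)` and all variable names are assumptions chosen for illustration, and the package may normalise its statistics differently.

```r
# Illustrative sketch only (base R, not the package's implementation).
set.seed(2)
n <- 300; p <- 5
x <- matrix(rnorm(n * p), nrow = n)     # rows = observations
x[, 2] <- x[, 1] + rnorm(n)             # pair (1, 2) is truly correlated (about 0.71)

r   <- cor(x)                                  # empirical correlations
idx <- which(upper.tri(r), arr.ind = TRUE)     # pairs (i, j) with i < j
m   <- nrow(idx)                               # number of tests, p (p - 1) / 2

# Fisher z-transform, approximately N(0, 1) under H0 for Gaussian data
z    <- sqrt(n - 3) * atanh(r[upper.tri(r)])
pval <- 2 * pnorm(-abs(z))                     # two-sided p-values

alpha     <- 0.05
thr_bonf  <- alpha / m                         # Bonferroni threshold
thr_sidak <- 1 - (1 - alpha)^(1 / m)           # Sidak threshold

# Index pairs (i, j) for which H0ij is rejected
rej_bonf  <- idx[pval <= thr_bonf,  , drop = FALSE]
rej_sidak <- idx[pval <= thr_sidak, , drop = FALSE]
print(rej_bonf)
```

The Šidák threshold is always at least as large as the Bonferroni one, so it rejects at least as many hypotheses.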

Asymptotically FDR controlling procedures:

Cai & Liu's (2016) two procedures and Benjamini & Hochberg's (1995) procedure (with no theoretical proof for the latter). To apply these procedures, use function ApplyFdrCor as follows: ApplyFdrCor(data,alpha,stat_test,method), with alpha the desired level of control for the FDR, and stat_test and method respectively the kind of test statistic and the FDR controlling method. The function returns the list of indexes $\lbrace (i,j), i < j \rbrace$ for which the null hypothesis $(H_{0ij})$ is rejected.
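As an illustration of FDR control on correlation tests, the following base R sketch applies the Benjamini & Hochberg step-up rule to p-values derived from the Student statistic. It relies on `p.adjust` from the stats package rather than on ApplyFdrCor, and the simulated data and variable names are assumptions made for the example.

```r
# Illustrative sketch only (base R, not the package's implementation).
set.seed(3)
n <- 300; p <- 6
x <- matrix(rnorm(n * p), nrow = n)     # rows = observations
x[, 2] <- x[, 1] + rnorm(n)             # variables 1, 2 and 3 are mutually correlated
x[, 3] <- x[, 1] + rnorm(n)

r    <- cor(x)
idx  <- which(upper.tri(r), arr.ind = TRUE)    # pairs (i, j) with i < j
r_ij <- r[upper.tri(r)]

# Student statistic and its two-sided p-value (t distribution, n - 2 df)
t_stat <- r_ij * sqrt(n - 2) / sqrt(1 - r_ij^2)
pval   <- 2 * pt(-abs(t_stat), df = n - 2)

alpha  <- 0.05
rej_bh <- idx[p.adjust(pval, method = "BH") <= alpha, , drop = FALSE]
print(rej_bh)                                  # rejected index pairs (i, j)
```

Because the BH-adjusted p-values are never larger than the Bonferroni-adjusted ones, this rule rejects at least as many hypotheses as Bonferroni's at the same alpha, at the price of controlling the FDR rather than the FWER.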

Functions SimuFwer and SimuFdr simulate Gaussian random variables for a given correlation matrix and return the estimated FWER, FDR, power and true discovery rate obtained by applying one of the procedures above. Some examples of results can be found in Chapter 6 of Roux (2018).
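The kind of Monte-Carlo estimate described here can be sketched in base R as follows. This is not the code of SimuFwer: it assumes a Bonferroni correction on Fisher z-statistics, draws Gaussian samples through a Cholesky factor, and takes power to be the average proportion of true non-zero correlations that are detected, which may differ from the package's definition.

```r
# Illustrative sketch only (base R, not the package's implementation).
set.seed(4)
p <- 5; rho <- 0.3; n <- 200; Nsimu <- 200; alpha <- 0.05

corr_theo <- diag(1, p)                 # target correlation matrix
corr_theo[1, 2:p] <- rho
corr_theo[2:p, 1] <- rho

up     <- upper.tri(corr_theo)
is_alt <- corr_theo[up] != 0            # TRUE where H1ij holds
m      <- sum(up)                       # number of tests
U      <- chol(corr_theo)               # t(U) %*% U equals corr_theo

any_false_rej <- logical(Nsimu)
prop_true_rej <- numeric(Nsimu)
for (s in seq_len(Nsimu)) {
  x   <- matrix(rnorm(n * p), nrow = n) %*% U  # rows ~ N(0, corr_theo)
  z   <- sqrt(n - 3) * atanh(cor(x)[up])       # Fisher z-statistics
  rej <- 2 * pnorm(-abs(z)) <= alpha / m       # Bonferroni rejections
  any_false_rej[s] <- any(rej & !is_alt)       # at least one false rejection
  prop_true_rej[s] <- mean(rej[is_alt])        # share of true signals found
}

fwer_hat  <- mean(any_false_rej)        # estimated FWER
power_hat <- mean(prop_true_rej)        # estimated power
print(c(fwer = fwer_hat, power = power_hat))
```

Right-multiplying standard normal rows by the upper Cholesky factor `U` gives rows with covariance `t(U) %*% U`, which is why the simulated variables follow the requested correlation structure.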

References

Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Methodological), 57(1), 289-300, https://doi.org/10.1111/j.2517-6161.1995.tb02031.x.

Bonferroni, C. E. (1935). Il calcolo delle assicurazioni su gruppi di teste. Studi in onore del professore Salvatore Ortu Carboni, 13-60.

Cai, T. T., & Liu, W. (2016). Large-scale multiple testing of correlations. Journal of the American Statistical Association, 111(513), 229-240, https://doi.org/10.1080/01621459.2014.999157.

Drton, M., & Perlman, M. D. (2007). Multiple testing and error control in Gaussian graphical model selection. Statistical Science, 22(3), 430-449, https://doi.org/10.1214/088342307000000113.

Romano, J. P., & Wolf, M. (2005). Exact and approximate stepdown methods for multiple hypothesis testing. Journal of the American Statistical Association, 100(469), 94-108, https://doi.org/10.1198/016214504000000539.

Roux, M. (2018). Graph inference by multiple testing with application to Neuroimaging, Ph.D., Université Grenoble Alpes, France, https://tel.archives-ouvertes.fr/tel-01971574v1.

Šidák, Z. (1967). Rectangular confidence regions for the means of multivariate normal distributions. Journal of the American Statistical Association, 62(318), 626-633.

Examples

# NOT RUN {
# Parameters for simulations
Nsimu  <- 100                # number of Monte-Carlo simulations
seqn   <- seq(100,400,100)   # sample sizes
p      <- 10                 # number of random variables considered
rho    <- 0.3                # value of non-zero correlations
seed   <- 156724
 
corr_theo <- diag(1,p)       # the correlation matrix
corr_theo[1,2:p] <- rho
corr_theo[2:p,1] <- rho               

# Parameters for multiple testing procedure
stat_test <- 'empirical'     # test statistics for correlation tests
method <- 'BootRW'           # FWER controlling procedure
SD <- FALSE                  # logical determining if stepdown is applied
alpha  <- 0.05               # FWER threshold 
Nboot  <- 100                # number of bootstrap or simulated samples

# Simulations and application of the chosen procedure
res <- matrix(0,nrow=length(seqn),ncol=4)
for(i in seq_along(seqn)){
    temp <- SimuFwer(corr_theo,n=seqn[i],Nsimu=Nsimu,alpha=alpha,stat_test=stat_test,
           method=method,Nboot=Nboot,stepdown=SD,seed=seed)
    res[i,] <- temp
}
rownames(res) <- seqn
colnames(res) <- names(temp)

# Display results
par(mfrow=c(1,2))
plot(seqn,res[,'fwer'],type='b',ylim=c(0,max(alpha*1.1,max(res[,'fwer']))),
    main='FWER',ylab='fwer',xlab='number of observations')
plot(seqn,res[,'power'],type='b',ylim=c(0,1.1),
    main='Power',ylab='power',xlab='number of observations')
# }
