mgc.test: MGC Permutation Test

Description

Test of Dependence using MGC Approach.

Usage

mgc.test(
  X,
  Y,
  is.dist.X = FALSE,
  dist.xfm.X = mgc.distance,
  dist.params.X = list(method = "euclidean"),
  dist.return.X = NULL,
  is.dist.Y = FALSE,
  dist.xfm.Y = mgc.distance,
  dist.params.Y = list(method = "euclidean"),
  dist.return.Y = NULL,
  nperm = 1000,
  option = "mgc",
  no_cores = 1
)

Arguments

X

is interpreted as:

a [n x d] data matrix: X is a data matrix with n samples in d dimensions, if flag is.dist.X=FALSE.
a [n x n] distance matrix: X is a distance matrix. Use flag is.dist.X=TRUE.

Y

is interpreted as:

a [n x d] data matrix: Y is a data matrix with n samples in d dimensions, if flag is.dist.Y=FALSE.
a [n x n] distance matrix: Y is a distance matrix. Use flag is.dist.Y=TRUE.
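The two input forms above can be illustrated with base R alone. The following is a minimal sketch, assuming Euclidean distances computed with stats::dist; the variable names DX and DY are illustrative and not part of the package.

```r
# Sketch: build [n x n] Euclidean distance matrices with base R.
set.seed(1)
n <- 20
X <- matrix(rnorm(n * 3), nrow = n)   # n samples in d = 3 dimensions
Y <- matrix(rnorm(n * 2), nrow = n)   # n samples in d = 2 dimensions

DX <- as.matrix(dist(X, method = "euclidean"))
DY <- as.matrix(dist(Y, method = "euclidean"))

# A distance matrix over n samples is square and symmetric.
stopifnot(nrow(DX) == n, ncol(DX) == n, isSymmetric(DX))
stopifnot(nrow(DY) == n, ncol(DY) == n, isSymmetric(DY))

# With the mgc package installed, the precomputed matrices would be passed as:
# result <- mgc.test(DX, DY, is.dist.X = TRUE, is.dist.Y = TRUE, nperm = 10)
```

Passing the raw matrices X and Y with the default is.dist.X = FALSE and is.dist.Y = FALSE instead leaves the distance computation to dist.xfm.X and dist.xfm.Y.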

is.dist.X

a boolean indicating whether your X input is a distance matrix or not. Defaults to FALSE.

dist.xfm.X

if is.dist.X == FALSE, a distance function to transform X. If a distance function is passed, it should accept an [n x d] matrix of n samples in d dimensions and return a [n x n] distance matrix, either directly or as the return argument named by dist.return.X. See mgc.distance for details.

dist.params.X

a list of trailing arguments to pass to the distance function specified in dist.xfm.X. Defaults to list(method='euclidean').

dist.return.X

the return argument for the specified dist.xfm.X containing the distance matrix. Defaults to NULL.

is.null(dist.return.X): use the return argument directly from dist.xfm.X(X) as the distance matrix. Should be a [n x n] matrix.
is.character(dist.return.X) | is.integer(dist.return.X): use dist.xfm.X(X)[[dist.return.X]] as the distance matrix. Should be a [n x n] matrix.
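The two cases for the return argument can be sketched in base R. Both my.distance and resolve.distance below are hypothetical names introduced for illustration; they mirror the documented behavior and are not the package's internal implementation.

```r
# Hypothetical distance function: returns a list rather than a bare matrix.
my.distance <- function(X, method = "euclidean") {
  list(D = as.matrix(dist(X, method = method)), method = method)
}

# Hypothetical helper mirroring the two documented cases for dist.return.
resolve.distance <- function(X, dist.xfm, dist.params = list(), dist.return = NULL) {
  # Call the distance function on X with the trailing arguments.
  out <- do.call(dist.xfm, c(list(X), dist.params))
  if (is.null(dist.return)) {
    return(out)                 # use the return value directly
  }
  out[[dist.return]]            # pick out the named (or indexed) element
}

# Two points, (0, 0) and (3, 4), in d = 2 dimensions.
X <- matrix(c(0, 0, 3, 4), nrow = 2, byrow = TRUE)
D <- resolve.distance(X, my.distance, list(method = "euclidean"), dist.return = "D")
```

Under this reading, a function shaped like my.distance would be supplied as dist.xfm.X together with dist.return.X = "D", whereas a function that returns the matrix itself needs only the default dist.return.X = NULL.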

is.dist.Y

a boolean indicating whether your Y input is a distance matrix or not. Defaults to FALSE.

dist.xfm.Y

if is.dist.Y == FALSE, a distance function to transform Y. If a distance function is passed, it should accept an [n x d] matrix of n samples in d dimensions and return a [n x n] distance matrix, either directly or as the return argument named by dist.return.Y. See mgc.distance for details.

dist.params.Y

a list of trailing arguments to pass to the distance function specified in dist.xfm.Y. Defaults to list(method='euclidean').

dist.return.Y

the return argument for the specified dist.xfm.Y containing the distance matrix. Defaults to NULL.

is.null(dist.return.Y): use the return argument directly from dist.xfm.Y(Y) as the distance matrix. Should be a [n x n] matrix.
is.character(dist.return.Y) | is.integer(dist.return.Y): use dist.xfm.Y(Y)[[dist.return.Y]] as the distance matrix. Should be a [n x n] matrix.

nperm

specifies the number of replicates to use for the permutation test. Defaults to 1000.
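To make the role of nperm concrete, here is a minimal sketch of a generic permutation test in base R. It uses the absolute Pearson correlation as a stand-in statistic, not the MGC statistic, and the add-one p-value estimate is an assumption of this sketch rather than a documented property of mgc.test. The name perm.test is hypothetical.

```r
# Generic permutation test of dependence; 'stat.fn' is a stand-in statistic.
perm.test <- function(x, y, stat.fn, nperm = 1000) {
  observed <- stat.fn(x, y)
  # Permuting y breaks any pairing between x and y, simulating the null.
  null.stats <- replicate(nperm, stat.fn(x, sample(y)))
  # Add-one estimate so the p-value is never exactly zero (an assumption here).
  p.value <- (1 + sum(null.stats >= observed)) / (1 + nperm)
  list(stat = observed, p.value = p.value)
}

set.seed(42)
x <- rnorm(50)
y <- 2 * x + rnorm(50, sd = 0.1)   # strongly dependent on x
res <- perm.test(x, y, function(a, b) abs(cor(a, b)), nperm = 200)
```

With this estimate the smallest attainable p-value is 1 / (nperm + 1), which is why a small number of replicates limits the resolution of the test.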

option

a string that specifies which global correlation to build upon. Defaults to 'mgc'.

'mgc': use the MGC global correlation.
'dcor': use the dcor global correlation.
'mantel': use the mantel global correlation.
'rank': use the rank global correlation.

no_cores

the number of cores to use for the permutations. Defaults to 1.

Value

A list containing the following:

p.value

the p-value of the MGC permutation test

stat

the sample MGC statistic, which lies within [-1, 1]

p.localCorr

the p-values of the local correlations, as a matrix indexed by the pair of scale indices.

localCorr

the local correlations

optimalScale

the optimal scale identified by MGC

option

specifies which global correlation was used

Details

A test of independence using the MGC approach, described in Vogelstein et al. (2019). For $(X, Y) \sim F_{XY}$ with marginals $X \sim F_X$, $Y \sim F_Y$:

$$H_0: F_{XY} = F_X F_Y$$ and: $$H_A: F_{XY} \neq F_X F_Y$$

Note that one should avoid reporting a positive discovery by selecting the smallest individual p-value among the local correlations, unless the p-values are corrected for multiple hypotheses.
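The effect of such a correction can be sketched with base R's p.adjust. The matrix p.local below holds made-up values standing in for a p.localCorr result, and the Benjamini-Hochberg method is one possible choice, not one prescribed by the package.

```r
# Hypothetical matrix of local p-values, indexed by the two scale indices.
p.local <- matrix(c(0.001, 0.020, 0.040,
                    0.030, 0.500, 0.800),
                  nrow = 2, byrow = TRUE)

# Benjamini-Hochberg adjustment over all entries; reshape back afterwards.
p.adj <- matrix(p.adjust(as.vector(p.local), method = "BH"),
                nrow = nrow(p.local))

# Scales still significant at the 0.05 level after correction.
significant <- which(p.adj <= 0.05, arr.ind = TRUE)
print(significant)
```

In this made-up example four raw entries fall below 0.05, but only one remains below 0.05 after adjustment, which is the kind of spurious discovery the note warns against.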

For details on usage see the help vignette: vignette("mgc", package = "mgc")

References

Joshua T. Vogelstein, et al. "Discovering and deciphering relationships across disparate data modalities." eLife (2019).

Examples

# NOT RUN {
library(mgc)

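n <- 100; d <- 2
# simulate n samples in d dimensions from a linear relationship
data <- mgc.sims.linear(n, d)
# note: on real data, one would set nperm much higher (at least 100);
# nperm is set to 10 merely for demonstration purposes
result <- mgc.test(data$X, data$Y, nperm=10)
# inspect the permutation p-value and the sample MGC statistic
print(result$p.value)
print(result$stat)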
# }