snQTL_test_corrnet: Spectral network quantitative trait loci (snQTL) test

Description

A spectral framework to detect network QTLs that affect co-expression networks. This is the main function for the snQTL test.

Given a list of expression data matrices from samples with different genotypes, we test whether there are significant differences among the three co-expression networks. Statistically, we consider the hypothesis testing task:

$$H_0: N_A = N_B = N_H,$$

where $A, B, H$ refer to the different genotypes and $N$ denotes the adjacency matrix of the corresponding co-expression network.

We provide four options for the test statistic, each built from sparse matrix or tensor eigenvalues. A permutation test is used to obtain the empirical p-value for the hypothesis test.
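The permutation step can be illustrated with a minimal sketch. The helper `perm_pvalue()` and the toy statistic `toy_stat()` are hypothetical and not part of snQTL; the sketch also assumes the empirical p-value is the plain fraction of permuted statistics at least as large as the observed one, which this help page does not confirm (a "+1" corrected variant is also common).

```r
# Hypothetical helper (not part of snQTL): empirical p-value as the
# fraction of permuted statistics that reach the observed statistic.
perm_pvalue <- function(stat_obs, stat_perm) {
  mean(stat_perm >= stat_obs)
}

# Toy statistic for illustration only: absolute difference in group means.
toy_stat <- function(x, g) abs(mean(x[g == 1]) - mean(x[g == 2]))

set.seed(1)
x <- rnorm(40)
g <- rep(1:2, each = 20)

stat_obs <- toy_stat(x, g)

# One seed per permutation, mirroring the role of the `seeds` argument.
stat_perm <- vapply(1:200, function(s) {
  set.seed(s)
  toy_stat(x, sample(g))  # shuffle the group labels
}, numeric(1))

p_emp <- perm_pvalue(stat_obs, stat_perm)
```

With npermute permutations the smallest nonzero value this estimate can take is 1/npermute, so the number of permutations limits the resolution of the p-value.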

NOTE: This function also applies to the generalized case of comparing multiple (K > 3) biological networks. Instead of separating the samples by genotype, users can separate them into K groups based on other metrics of interest, e.g., location or treatment. The generalized hypothesis testing problem becomes $$H_0: N_1 = \cdots = N_K,$$ where $N_k$ refers to the correlation-based network of group k. For consistency, this help document keeps the original genotype-based setting. See the GitHub manual at https://github.com/Marchhu36/snQTL for details and examples of the generalization.
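As a rough illustration of the generalized setting, the sketch below splits one expression matrix into K = 4 group-specific matrices. The objects `expr`, `group`, and `exp_list_K` and the treatment labels are made up for this example; the resulting list has the same shape as exp_list (one n_k-by-p matrix per group).

```r
# Made-up data: 90 samples, 10 genes, and a grouping factor with K = 4 levels.
set.seed(1)
expr  <- matrix(rnorm(90 * 10), nrow = 90)
group <- factor(sample(c("ctrl", "trtA", "trtB", "trtC"), 90, replace = TRUE))

# One n_k-by-p matrix per group, in the order of levels(group).
# drop = FALSE keeps each piece a matrix even for very small groups.
exp_list_K <- lapply(levels(group), function(lv) {
  expr[group == lv, , drop = FALSE]
})

sapply(exp_list_K, dim)  # rows: n_k and p for each group
```

A list built this way would then be passed as exp_list; see the GitHub manual for the authors' own examples.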

Usage

snQTL_test_corrnet(
  exp_list,
  method = c("sum", "sum_square", "max", "tensor"),
  npermute = 100,
  seeds = 1:100,
  stats_seed = NULL,
  rho = 1000,
  sumabs = 0.2,
  niter = 20,
  trace = FALSE,
  adj.beta = -1,
  tensor_iter = 20,
  tensor_tol = 10^(-3),
  trans = FALSE,
  location = NULL
)

Value

a list containing the following:

method: character, the chosen test statistic, echoed from the input
res_original: list, the test result for the non-permuted data, including the chosen method, the test statistic, and the decomposition components
res_permute: list, the test results for each permuted data set, including the chosen method, the test statistics, and the decomposition components
emp_p_value: number, the empirical p-value from the permutation test

Arguments

exp_list: list, a list of expression data from samples with different genotypes; the dimensions for data matrices are n1-by-p, n2-by-p, and n3-by-p, respectively; see "details"
method: character, the choice of test statistics; see "details"
npermute: number, the number of permutations used to obtain the empirical p-value
seeds: vector, the random seeds for the permutations; the vector has length npermute
stats_seed: number, the random seed for test statistics calculation with non-permuted data
rho: number, a large positive constant added to the diagonal elements to ensure positive definiteness in the symmetric matrix spectral decomposition
sumabs: number, specifies the sparsity level of the matrix/tensor eigenvector; sumabs takes a value between $1/\sqrt{p}$ and 1, where $p$ is the dimension; sumabs $\times \sqrt{p}$ is the upper bound on the L1 norm of the leading matrix/tensor eigenvector (see symmPMD())
niter: integer, the number of iterations to use in the PMD algorithm (see symmPMD())
trace: logical, whether to trace the progress of the PMD algorithm (see symmPMD())
adj.beta: number, the power transformation applied to the correlation matrices (see getDiffMatrix()); in particular, when adj.beta = 0 the correlation matrix is used, and when adj.beta < 0 the covariance matrix is used
tensor_iter: integer, the maximum number of iterations in the SSTD algorithm (see max_iter in SSTD())
tensor_tol: number, a small positive constant for the error difference that indicates SSTD convergence (see tol in SSTD())
trans: logical, whether to consider only the trans-correlation (between genes from two different chromosomes or regions); see "details"
location: vector, the (chromosome) locations for genes if trans = TRUE
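Two of the numeric arguments above can be checked by hand. The sketch below is illustrative only and does not call snQTL: it computes the admissible range of sumabs for a given p, and it verifies the general linear-algebra fact that adding rho to the diagonal shifts every eigenvalue by rho, which makes an indefinite symmetric matrix positive definite when rho is large enough. How the package uses the shifted matrix internally is not described here.

```r
## sumabs: admissible range and implied L1 bound
p      <- 200                  # number of genes (network dimension)
sumabs <- 0.2                  # default value from the Usage section
lower  <- 1 / sqrt(p)          # smallest admissible sumabs
stopifnot(sumabs >= lower, sumabs <= 1)
l1_bound <- sumabs * sqrt(p)   # upper bound on the L1 norm of the eigenvector

## rho: a diagonal shift moves every eigenvalue by rho
set.seed(1)
A   <- matrix(rnorm(25), nrow = 5)
D   <- A + t(A)                # symmetric toy matrix, generally indefinite
rho <- 1000
ev     <- eigen(D, symmetric = TRUE)$values
ev_rho <- eigen(D + rho * diag(5), symmetric = TRUE)$values

all.equal(ev_rho - rho, ev)    # the spectra differ only by the shift
```

Because the shift leaves the eigenvectors unchanged, the choice of rho should not matter as long as it exceeds the magnitude of the most negative eigenvalue.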

Details

In exp_list, the data matrices are usually ordered by the marker's genotypes AA, BB, and AB. The expression data are usually normalized. We use the expression data to generate Pearson correlation co-expression networks.

Given the list of co-expression networks, we generate the pairwise differential networks $$D_{AB} = N_A - N_B, \quad D_{AH} = N_H - N_A, \quad D_{BH} = N_H - N_B.$$ These pairwise differential networks are used to build the snQTL test statistics.
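The construction can be sketched in base R as follows. This is a simplified illustration, not the package's getDiffMatrix(): it uses plain Pearson correlations, ignores adj.beta and trans, and assumes that H denotes the heterozygous AB group. The toy matrices `exp_A`, `exp_B`, and `exp_H` are made up.

```r
set.seed(1)
p <- 5
exp_A <- matrix(rnorm(30 * p), nrow = 30)  # samples with genotype AA
exp_B <- matrix(rnorm(40 * p), nrow = 40)  # samples with genotype BB
exp_H <- matrix(rnorm(50 * p), nrow = 50)  # samples with genotype AB (assumed H)

# Pearson correlation co-expression networks (p-by-p)
N_A <- cor(exp_A)
N_B <- cor(exp_B)
N_H <- cor(exp_H)

# Pairwise differential networks, with the signs used in this document
D_AB <- N_A - N_B
D_AH <- N_H - N_A
D_BH <- N_H - N_B
```

Each differential network is symmetric with a zero diagonal, and the three are linearly dependent: D_BH - D_AH equals D_AB.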

We provide four test statistics, selected by the choice of method:

sum, the sum of the sparse leading matrix eigenvalues (sLMEs) of all pairwise differential networks:

$$\mathrm{Stat}_{\mathrm{sum}} = \lambda(D_{AB}) + \lambda(D_{AH}) + \lambda(D_{BH}),$$

where $\lambda$ refers to the sLME operation with the sparsity level set by sumabs.

sum_square, the sum of the squared sLMEs:

$$\mathrm{Stat}_{\mathrm{sumsquare}} = \lambda^2(D_{AB}) + \lambda^2(D_{AH}) + \lambda^2(D_{BH}).$$

max, the maximum of the sLMEs:

$$\mathrm{Stat}_{\mathrm{max}} = \max(\lambda(D_{AB}), \lambda(D_{AH}), \lambda(D_{BH})).$$

tensor, the sparse leading tensor eigenvalue (sLTE) of the differential tensor:

$$\mathrm{Stat}_{\mathrm{tensor}} = \Lambda(\mathcal{D}),$$

where $\Lambda$ refers to the sLTE operation with the sparsity level set by sumabs, and $\mathcal{D}$ is the differential tensor formed by stacking the three pairwise differential networks.
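The three matrix-based statistics differ only in how the per-network eigenvalues are combined. The sketch below shows that combination step on made-up diagonal matrices. As a stand-in for the sLME it uses the largest absolute eigenvalue from a dense decomposition, with no sparsity constraint; this is an assumption for illustration, not the package's symmPMD()-based computation. The helpers `lead_eig()` and `combine_stats()` are hypothetical, and the tensor option is omitted because it requires the SSTD algorithm.

```r
# Dense stand-in for the sLME (assumption): largest absolute eigenvalue.
lead_eig <- function(D) {
  max(abs(eigen(D, symmetric = TRUE, only.values = TRUE)$values))
}

# Hypothetical helper: combine per-network eigenvalues into one statistic.
combine_stats <- function(lams, method = c("sum", "sum_square", "max")) {
  method <- match.arg(method)
  switch(method,
         sum        = sum(lams),
         sum_square = sum(lams^2),
         max        = max(lams))
}

# Made-up differential networks with known eigenvalues
D_AB <- diag(c(0.5, -0.2))
D_AH <- diag(c(1, 0))
D_BH <- diag(c(-2, 1))

lams <- c(lead_eig(D_AB), lead_eig(D_AH), lead_eig(D_BH))  # 0.5, 1, 2

combine_stats(lams, "sum")         # 3.5
combine_stats(lams, "sum_square")  # 5.25
combine_stats(lams, "max")         # 2
```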

Additionally, if trans = TRUE, we consider only the trans-correlation between genes from two different chromosomes or regions in the co-expression networks. The entry $N_{ij}$ of a correlation matrix is set to 0 if gene i and gene j are from the same chromosome or region. The gene location information is required if trans = TRUE.
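The masking rule can be sketched with a logical matrix built from the location vector. This is an illustration of the rule as stated, not the package's internal code; `trans_mask` and `N_trans` are names introduced for the example. Note that under a literal reading the diagonal is also zeroed, since each gene shares a location with itself.

```r
location <- c(1, 1, 2, 2, 3)   # toy chromosome labels for 5 genes
set.seed(1)
N <- cor(matrix(rnorm(20 * 5), nrow = 20))  # toy correlation network

# TRUE where gene i and gene j sit on different chromosomes (trans pairs)
trans_mask <- outer(location, location, FUN = "!=")

# Zero out all same-chromosome (cis) entries, keep trans entries unchanged
N_trans <- N * trans_mask
```

The location vector must have one entry per gene, in the same order as the columns of the expression matrices.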

References

Hu, J., Weber, J. N., Fuess, L. E., Steinel, N. C., Bolnick, D. I., & Wang, M. (2025). A spectral framework to map QTLs affecting joint differential networks of gene co-expression. PLOS Computational Biology, 21(4), e1012953.

Examples

### artificial example
n1 = 50
n2 = 60
n3 = 100

p = 200

location = c(rep(1,20), rep(2, 50), rep(3, 100), rep(4, 30))

## expression data from null
set.seed(0416) # random seeds for example data
exp1 = matrix(rnorm(n1*p, mean = 0, sd = 1), nrow = n1)
exp2 = matrix(rnorm(n2*p, mean = 0, sd = 1), nrow = n2)
exp3 = matrix(rnorm(n3*p, mean = 0, sd = 1), nrow = n3)

exp_list = list(exp1, exp2, exp3)

result = snQTL_test_corrnet(exp_list = exp_list, method = 'tensor',
                          npermute = 30, seeds = 1:30, stats_seed = 0416,
                          trans = TRUE, location = location)

result$emp_p_value
