
snQTL (version 0.2)

snQTL_test_corrnet: Spectral network quantitative trait loci (snQTL) test

Description

A spectral framework for detecting network QTLs that affect co-expression networks. This is the main function for the snQTL test.

Given a list of expression data matrices from samples with different genotypes, we test whether there are significant differences among the three co-expression networks. Statistically, we consider the hypothesis testing task:

$$H_0: N_A = N_B = N_H,$$

where \(A, B, H\) refer to the different genotypes and \(N\) denotes the adjacency matrix of the corresponding co-expression network.

We provide four options for the test statistic, each built from sparse matrix/tensor eigenvalues. We perform a permutation test to obtain the empirical p-value for the hypothesis test.
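
The permutation step can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: `toy_stat` and `perm_p_value` are hypothetical helpers, the statistic is an ordinary (non-sparse) eigenvalue of a two-group difference, and the add-one p-value convention is an assumption.

```r
# Toy statistic (assumption for illustration): largest absolute eigenvalue of
# the difference between two Pearson correlation matrices.
toy_stat <- function(x, y) {
  d <- cor(x) - cor(y)
  max(abs(eigen(d, symmetric = TRUE, only.values = TRUE)$values))
}

# Shuffle samples across groups, recompute the statistic, and compare each
# permuted value with the observed one.
perm_p_value <- function(x, y, n_perm = 50, seed = 1) {
  set.seed(seed)
  observed <- toy_stat(x, y)
  pooled <- rbind(x, y)
  n_x <- nrow(x)
  permuted <- vapply(seq_len(n_perm), function(b) {
    idx <- sample(nrow(pooled))
    toy_stat(pooled[idx[seq_len(n_x)], , drop = FALSE],
             pooled[idx[-seq_len(n_x)], , drop = FALSE])
  }, numeric(1))
  # Add-one correction (assumed convention; the package may use another)
  p <- (1 + sum(permuted >= observed)) / (1 + n_perm)
  list(observed = observed, permuted = permuted, p_value = p)
}

set.seed(1)
x <- matrix(rnorm(30 * 5), nrow = 30)
y <- matrix(rnorm(40 * 5), nrow = 40)
out <- perm_p_value(x, y, n_perm = 50, seed = 2)
out$p_value
```

With more permutations the empirical p-value has finer resolution, at the cost of one full recomputation of the statistic per permutation.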

NOTE: This function also applies to generalized cases that compare multiple (K > 3) biological networks. Instead of separating the samples by genotype, users can separate the samples into K groups based on other metrics of interest, e.g., locations or treatments. The generalized hypothesis testing problem becomes $$H_0: N_1 = \cdots = N_K,$$ where \(N_k\) refers to the correlation-based network corresponding to group k. For consistency, we stick with the original genotype-based setting in this help document. See the GitHub manual at https://github.com/Marchhu36/snQTL for details and examples of the generalization.
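
As a rough sketch of how a K-group input list might be prepared, the base R snippet below splits one sample-by-gene matrix into per-group matrices. The names `expr`, `group`, and `exp_list_K` are hypothetical, and the data are simulated; only the list-of-matrices shape follows the description above.

```r
set.seed(4)
n <- 40   # total number of samples (illustrative)
p <- 5    # number of genes (illustrative)
expr <- matrix(rnorm(n * p), nrow = n)

# Hypothetical grouping variable, e.g. four sampling locations
group <- factor(rep(c("site1", "site2", "site3", "site4"), each = 10))

# One n_k-by-p matrix per group, all sharing the same p columns
exp_list_K <- lapply(split(seq_len(nrow(expr)), group),
                     function(idx) expr[idx, , drop = FALSE])

vapply(exp_list_K, nrow, integer(1))
```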

Usage

snQTL_test_corrnet(
  exp_list,
  method = c("sum", "sum_square", "max", "tensor"),
  npermute = 100,
  seeds = 1:100,
  stats_seed = NULL,
  rho = 1000,
  sumabs = 0.2,
  niter = 20,
  trace = FALSE,
  adj.beta = -1,
  tensor_iter = 20,
  tensor_tol = 10^(-3),
  trans = FALSE,
  location = NULL
)

Value

a list containing the following:

method

character, recall of the choice of test statistics

res_original

list, test result for non-permuted data, including the recall of method choices, test statistics, and decomposition components

res_permute

list, test results for each permuted data, including the recall of method choices, test statistics, and decomposition components

emp_p_value

number, the empirical p-value from permutation test

Arguments

exp_list

list, a list of expression data from samples with different genotypes; the dimensions for data matrices are n1-by-p, n2-by-p, and n3-by-p, respectively; see "details"

method

character, the choice of test statistics; see "details"

npermute

number, the number of permutations to obtain empirical p-values

seeds

vector, the random seeds for the permutations; the length of the vector must equal npermute

stats_seed

number, the random seed for test statistics calculation with non-permuted data

rho

number, a large positive constant added to the diagonal elements to ensure positive definiteness in the symmetric matrix spectral decomposition

sumabs

number, specifies the sparsity level of the matrix/tensor eigenvector; sumabs takes a value between \(1/\sqrt{p}\) and 1, where \(p\) is the dimension; sumabs\(\times\sqrt{p}\) is the upper bound on the L1 norm of the leading matrix/tensor eigenvector (see symmPMD())

niter

integer, the number of iterations to use in the PMD algorithm (see symmPMD())

trace

logical, whether to trace the progress of the PMD algorithm (see symmPMD())

adj.beta

number, the power transformation applied to the correlation matrices (see getDiffMatrix()); in particular, when adj.beta = 0 the correlation matrix is used, and when adj.beta < 0 the covariance matrix is used.

tensor_iter

integer, the maximum number of iterations in the SSTD algorithm (see max_iter in SSTD())

tensor_tol

number, a small positive threshold on the error difference used to declare SSTD convergence (see tol in SSTD())

trans

logical, whether to consider only the trans-correlation (between genes from two different chromosomes or regions); see "details"

location

vector, the (chromosome) locations for genes if trans = TRUE
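
The constraints on sumabs and seeds described above can be checked by hand before starting a long permutation run. The snippet below is only a sketch with hypothetical variable names; it works through the arithmetic for p = 200 genes and the default sumabs = 0.2, and does not call the package.

```r
p <- 200            # number of genes (dimension)
sumabs <- 0.2       # sparsity level, as in the default
npermute <- 100
seeds <- 1:100

# sumabs must lie between 1/sqrt(p) and 1
lower <- 1 / sqrt(p)          # about 0.0707 for p = 200
stopifnot(sumabs >= lower, sumabs <= 1)

# Implied upper bound on the L1 norm of the leading eigenvector
l1_bound <- sumabs * sqrt(p)  # about 2.83 for p = 200

# One seed per permutation
stopifnot(length(seeds) == npermute)

c(lower = lower, l1_bound = l1_bound)
```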

Details

In exp_list, the data matrices are usually ordered by the marker's genotypes AA, BB, and AB. The expression data is usually normalized. We use the expression data to build Pearson correlation co-expression networks.

Given the list of co-expression networks, we generate pairwise differential networks $$D_{AB} = N_A - N_B, \quad D_{AH} = N_H - N_A, \quad D_{BH} = N_H - N_B.$$ We use the pairwise differential networks to generate the snQTL test statistics.
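
The construction above can be reproduced with base R alone. This sketch uses simulated data and plain Pearson correlations; it ignores the adj.beta transformation and the trans option, and the object names are illustrative.

```r
set.seed(1)
p <- 6  # number of genes (illustrative)

# Simulated expression matrices for genotypes A, B, and H (samples in rows)
exp_A <- matrix(rnorm(20 * p), nrow = 20)
exp_B <- matrix(rnorm(25 * p), nrow = 25)
exp_H <- matrix(rnorm(30 * p), nrow = 30)

# Pearson correlation co-expression networks (p-by-p)
N_A <- cor(exp_A)
N_B <- cor(exp_B)
N_H <- cor(exp_H)

# Pairwise differential networks, with the same signs as in the text
D_AB <- N_A - N_B
D_AH <- N_H - N_A
D_BH <- N_H - N_B

round(D_AB, 2)
```

Each differential network is symmetric with a zero diagonal, and with these sign conventions \(D_{AH} - D_{BH} = -D_{AB}\).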

We provide four test statistics, selected by method:

  1. sum, the sum of sparse leading matrix eigenvalues (sLMEs) of all pairwise differential networks:

    $$\mathrm{Stat}_{\mathrm{sum}} = \lambda(D_{AB}) + \lambda(D_{AH}) + \lambda(D_{BH}),$$

    where \(\lambda\) refers to the sLME operation with the sparsity level set by sumabs.

  2. sum_square, the sum of squared sLMEs:

    $$\mathrm{Stat}_{\mathrm{sumsquare}} = \lambda^2(D_{AB}) + \lambda^2(D_{AH}) + \lambda^2(D_{BH}).$$

  3. max, the maximum of the sLMEs:

    $$\mathrm{Stat}_{\mathrm{max}} = \max(\lambda(D_{AB}), \lambda(D_{AH}), \lambda(D_{BH})).$$

  4. tensor, the sparse leading tensor eigenvalue (sLTE) of the differential tensor:

    $$\mathrm{Stat}_{\mathrm{tensor}} = \Lambda(\mathcal{D}),$$

    where \(\Lambda\) refers to the sLTE operation with the sparsity level set by sumabs, and \(\mathcal{D}\) is the differential tensor formed by stacking the three pairwise differential networks.
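
To make the three matrix-based statistics concrete, the sketch below combines one eigenvalue per differential network into the sum, sum_square, and max forms. As a loudly labeled assumption, `lead_eig` is a hypothetical stand-in that uses the ordinary largest absolute eigenvalue from `eigen()`, not the sparse sLME computed by the package, and the tensor statistic is not covered.

```r
# Stand-in for the sLME (assumption): dense largest absolute eigenvalue
lead_eig <- function(D) {
  max(abs(eigen(D, symmetric = TRUE, only.values = TRUE)$values))
}

set.seed(2)
p <- 6
# Simulated correlation networks for genotypes A, B, and H
nets <- lapply(c(20, 25, 30), function(n) cor(matrix(rnorm(n * p), nrow = n)))

# Pairwise differential networks, with the same signs as in the text
D_list <- list(AB = nets[[1]] - nets[[2]],
               AH = nets[[3]] - nets[[1]],
               BH = nets[[3]] - nets[[2]])

lambda <- vapply(D_list, lead_eig, numeric(1))

stats <- c(sum        = sum(lambda),
           sum_square = sum(lambda^2),
           max        = max(lambda))
stats
```

Because the stand-in eigenvalues are non-negative, the max statistic never exceeds the sum statistic in this sketch.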

Additionally, if trans = TRUE, we only consider the trans-correlation between genes from two different chromosomes or regions in the co-expression networks. The entries of the correlation matrices are set to \(N_{ij} = 0\) if gene i and gene j are from the same chromosome or region. The gene location information is required if trans = TRUE.
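
The masking rule can be illustrated in base R. This is a sketch of the idea with a tiny simulated network, not the package's internal code; `same_region` and `N_trans` are illustrative names.

```r
# Chromosome (or region) label for each of six genes
location <- c(1, 1, 2, 2, 2, 3)

set.seed(3)
N <- cor(matrix(rnorm(20 * length(location)), nrow = 20))

# TRUE where gene i and gene j share a chromosome (includes the diagonal)
same_region <- outer(location, location, "==")

# Keep only trans-correlations: zero out same-chromosome entries
N_trans <- N
N_trans[same_region] <- 0

round(N_trans, 2)
```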

References

Hu, J., Weber, J. N., Fuess, L. E., Steinel, N. C., Bolnick, D. I., & Wang, M. (2025). A spectral framework to map QTLs affecting joint differential networks of gene co-expression. PLOS Computational Biology, 21(4), e1012953.

Examples

### artificial example
n1 = 50
n2 = 60
n3 = 100

p = 200

location = c(rep(1,20), rep(2, 50), rep(3, 100), rep(4, 30))

## expression data from null
set.seed(0416) # random seeds for example data
exp1 = matrix(rnorm(n1*p, mean = 0, sd = 1), nrow = n1)
exp2 = matrix(rnorm(n2*p, mean = 0, sd = 1), nrow = n2)
exp3 = matrix(rnorm(n3*p, mean = 0, sd = 1), nrow = n3)

exp_list = list(exp1, exp2, exp3)

result = snQTL_test_corrnet(exp_list = exp_list, method = 'tensor',
                          npermute = 30, seeds = 1:30, stats_seed = 0416,
                          trans = TRUE, location = location)

result$emp_p_value
