association.test: Association Test

Description

Association tests between phenotype and SNPs.

Usage

association.test(x, Y = x@ped$pheno, X = matrix(1, nrow(x)), method = "lmm", response = c("quantitative", "binary"),  test = c("score", "wald", "lrt"), K, eigenK, beg = 1, end = ncol(x), p = 0, tol = .Machine$double.eps^0.25, multithreaded = FALSE, ...)

Arguments

A bed.matrix

The phenotype vector. Default is the column (pheno) of x@ped

A covariable matrix. The default is a column vector of ones, to include an intercept in the model

method

Method to use. Currently, only "lmm" is implemented

response

Is "Y" a quantitative or a binary phenotype?

test

Which test to use. For binary phenotypes, test = "score" is mandatory

A Genetic Relationship Matrix (as produced by GRM), or a list of such matrices. For test = "score".

eigenK

Eigen decomposition of the Genetic Relationship Matrix (as produced by the function eigen). For test = "wald" or "lrt".

beg

Index of the first SNP tested for association

end

Index of the last SNP tested for association

Number of Principal Components to include in the model with fixed effect (for test = "wald" or "lrt")

tol

Parameter for the likelihood maximization (as in optimize)

multithreaded

Logical. If TRUE, several threads are used.

...

Additional parameters for lmm.aireml or logistic.mm.aireml (if test = "score").

Value

A data frame, giving for each considered SNP:

Details

Tests the association between the phenotype and requested SNPs in x.

For quantitative traits (response = "quantitative"), the following model in considered for each SNP $$ Y = (X|PC)\alpha + G\beta + \omega + \varepsilon $$ with $omega ~ N(0, tau K)$ and $epsilon ~ N(0, sigma^2 I_n)$. $G$ is the genotype vector of the SNP, $K$ is a Genetic Relationship Matrix (GRM) $X$ the covariable matrix, and $PC$ the matrix of the first $p$ principal components.

If test = "score", all parameters are estimated with the same procedure as in lmm.aireml and the argument K is used to specify the GRM matrix (or a list of GRM matrices for inclusion of several random effects in the model). The argument p is ignored. For Wald and LRT tests the procedure used is the same as in lmm.diago and eigenK is used to specify the GRM matrix.

For binary traits (response = "binary"), the following model in considered for each SNP $$ \mbox{logit}(P[Y=1| X, G, \omega]) = X\alpha + G\beta + \omega$$ with $omega ~ N(0, tau K)$. $G$ is the genotype vector of the SNP, $K$ is a Genetic Relationship Matrix (GRM), $X$ the covariable matrix. All parameters under null model are estimated with the same procedure as in logistic.mm.aireml. It is possible to give a list in argument K for inclusion of several random effects in the model. The argument p is ignored.

Note: in the present implementation, using several threads is possible for Wald and LRT tests, but is likely to be slower than using only one, hence the multithreaded parameter. This might change in the future.

Examples

Run this code


# Load data
data(AGT)
x <- as.bed.matrix(AGT.gen, AGT.fam, AGT.bim)
standardize(x) <- "p"

# generate a random positive matrix (to play the role of the GRM)
set.seed(1)
R <- random.pm(nrow(x))

# simulate quantitative phenotype with effect of SNP #351 and a polygenic component 
y <- x %*% c(rep(0,350),0.25,rep(0,ncol(x)-351)) + lmm.simu(0.3,1,eigenK=R$eigen)$y

# association test (default is quantitative response, score test)
t_score <- association.test(x, y, K = R$K)
t_wald <- association.test(x, y, eigenK = R$eigen, test = "wald")

# comparaison des p-valeurs obtenues par les deux tests
plot( t_score$p, t_wald$p, log = "xy", xlab = "score", ylab = "wald")

# (mini) Manhattan plot
plot(-log10(t_score$p), xlab="SNP index", ylab = "-log(p)")

# link between p-values and LD with SNP #351
lds <- LD(x, 351, c(1,ncol(x)))
plot(lds, -log10(t_score$p), xlab="r^2", ylab="-log(p)")

# use y to simulate a binary phenotype
y1 <- as.numeric(y > mean(y))
t_binary <- association.test(x, y1, K = R$K, response = "binary")
# (mini) Manhattan plot
plot(-log10(t_binary$p), xlab="SNP index", ylab = "-log(p)")

Run the code above in your browser using DataLab