Learn R Programming

gaston (version 1.4.8)

association.test: Association Test

Description

Association tests between phenotype and SNPs.

Usage

association.test(x, Y = x@ped$pheno, X = matrix(1, nrow(x)), method = "lmm", response = c("quantitative", "binary"), test = c("score", "wald", "lrt"), K, eigenK, beg = 1, end = ncol(x), p = 0, tol = .Machine$double.eps^0.25, multithreaded = FALSE, ...)

Arguments

Y
The phenotype vector. Default is the column (pheno) of x@ped
X
A covariable matrix. The default is a column vector of ones, to include an intercept in the model
method
Method to use. Currently, only "lmm" is implemented
response
Is "Y" a quantitative or a binary phenotype?
test
Which test to use. For binary phenotypes, test = "score" is mandatory
K
A Genetic Relationship Matrix (as produced by GRM), or a list of such matrices. For test = "score".
eigenK
Eigen decomposition of the Genetic Relationship Matrix (as produced by the function eigen). For test = "wald" or "lrt".
beg
Index of the first SNP tested for association
end
Index of the last SNP tested for association
p
Number of Principal Components to include in the model with fixed effect (for test = "wald" or "lrt")
tol
Parameter for the likelihood maximization (as in optimize)
multithreaded
Logical. If TRUE, several threads are used.
...
Additional parameters for lmm.aireml or logistic.mm.aireml (if test = "score").

Value

A data frame, giving for each considered SNP:

Details

Tests the association between the phenotype and requested SNPs in x.

For quantitative traits (response = "quantitative"), the following model in considered for each SNP $$ Y = (X|PC)\alpha + G\beta + \omega + \varepsilon $$ with $omega ~ N(0, tau K)$ and $epsilon ~ N(0, sigma^2 I_n)$. $G$ is the genotype vector of the SNP, $K$ is a Genetic Relationship Matrix (GRM) $X$ the covariable matrix, and $PC$ the matrix of the first $p$ principal components.

If test = "score", all parameters are estimated with the same procedure as in lmm.aireml and the argument K is used to specify the GRM matrix (or a list of GRM matrices for inclusion of several random effects in the model). The argument p is ignored. For Wald and LRT tests the procedure used is the same as in lmm.diago and eigenK is used to specify the GRM matrix.

For binary traits (response = "binary"), the following model in considered for each SNP $$ \mbox{logit}(P[Y=1| X, G, \omega]) = X\alpha + G\beta + \omega$$ with $omega ~ N(0, tau K)$. $G$ is the genotype vector of the SNP, $K$ is a Genetic Relationship Matrix (GRM), $X$ the covariable matrix. All parameters under null model are estimated with the same procedure as in logistic.mm.aireml. It is possible to give a list in argument K for inclusion of several random effects in the model. The argument p is ignored.

Note: in the present implementation, using several threads is possible for Wald and LRT tests, but is likely to be slower than using only one, hence the multithreaded parameter. This might change in the future.

See Also

lmm.diago, lmm.aireml, logistic.mm.aireml, optimize

Examples

Run this code

# Load data
data(AGT)
x <- as.bed.matrix(AGT.gen, AGT.fam, AGT.bim)
standardize(x) <- "p"

# generate a random positive matrix (to play the role of the GRM)
set.seed(1)
R <- random.pm(nrow(x))

# simulate quantitative phenotype with effect of SNP #351 and a polygenic component 
y <- x %*% c(rep(0,350),0.25,rep(0,ncol(x)-351)) + lmm.simu(0.3,1,eigenK=R$eigen)$y

# association test (default is quantitative response, score test)
t_score <- association.test(x, y, K = R$K)
t_wald <- association.test(x, y, eigenK = R$eigen, test = "wald")

# comparaison des p-valeurs obtenues par les deux tests
plot( t_score$p, t_wald$p, log = "xy", xlab = "score", ylab = "wald")

# (mini) Manhattan plot
plot(-log10(t_score$p), xlab="SNP index", ylab = "-log(p)")

# link between p-values and LD with SNP #351
lds <- LD(x, 351, c(1,ncol(x)))
plot(lds, -log10(t_score$p), xlab="r^2", ylab="-log(p)")

# use y to simulate a binary phenotype
y1 <- as.numeric(y > mean(y))
t_binary <- association.test(x, y1, K = R$K, response = "binary")
# (mini) Manhattan plot
plot(-log10(t_binary$p), xlab="SNP index", ylab = "-log(p)")

Run the code above in your browser using DataLab