GWA: Genome-wide association analysis

Description

Performs genome-wide association analysis based on the mixed model $$y = X \beta + Z u + \varepsilon$$ where $\beta$ is a vector of fixed effects that can model both environmental factors and population structure. The variable $u$ models the genetic background of each line as a random effect with $Var[u] = K \sigma^2$. The residual variance is $Var[\varepsilon] = I \sigma_e^2$.
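
To make the roles of $X$, $Z$, $u$ and $\varepsilon$ concrete, the following base-R sketch simulates one data set from the model above. The dimensions, the intercept value and the two variance components (`sigma2_u`, `sigma2_e`) are made-up values chosen for illustration, and the kinship matrix is built from random genotypes using the default formula given under Arguments.

```r
# Simulate y = X beta + Z u + e; every number below is hypothetical
set.seed(1)
n_lines <- 20                          # number of lines
m <- 50                                # number of markers
G <- matrix(sample(c(-1, 0, 1), n_lines * m, replace = TRUE),
            nrow = n_lines, ncol = m)  # genotypes coded {-1, 0, 1}
K <- tcrossprod(G) / m                 # kinship, K = G G' / m

n <- n_lines                           # one observation per line
Z <- diag(n)                           # relates observations to lines
X <- matrix(1, nrow = n, ncol = 1)     # intercept-only fixed effects
beta <- 10                             # hypothetical intercept
sigma2_u <- 1                          # hypothetical genetic variance
sigma2_e <- 0.5                        # hypothetical residual variance

# Draw u with Var[u] = K * sigma2_u; the eigendecomposition is used
# because K is only guaranteed to be positive semidefinite
eig <- eigen(K, symmetric = TRUE)
L <- eig$vectors %*% diag(sqrt(pmax(eig$values, 0)))
u <- sqrt(sigma2_u) * as.vector(L %*% rnorm(n_lines))

e <- rnorm(n, sd = sqrt(sigma2_e))     # Var[e] = I * sigma2_e
y <- as.vector(X %*% beta + Z %*% u + e)
```

With replicated observations, `Z` would have more rows than columns, with a single 1 in each row marking the line that observation belongs to.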

Usage

GWA(y, G, Z, X = NULL, K = NULL, min.MAF = 0.01, 
     check.rank = "FALSE")

Arguments

y

Vector ($n \times 1$) of observations

G

Matrix ($t \times m$) of genotypes for $t$ lines with $m$ bi-allelic markers. Genotypes should be coded as {-1,0,1} = {aa,Aa,AA}.

Z

0-1 matrix ($n \times t$) relating observations to lines

X

Design matrix ($n \times p$) for the fixed effects. If not passed, a vector of 1's is used to model the intercept.

K

Kinship matrix for the population; must be positive semidefinite. If not passed, $K = G G' / m$.

min.MAF

Specifies the minimum minor allele frequency (MAF). If a marker has a MAF less than min.MAF, it is assigned a zero score.

check.rank

If "TRUE", function will check the rank of the augmented design matrix for each marker. Markers for which the design matrix is singular are assigned a zero score.
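
The two marker filters described above can be illustrated in base R. This is a sketch of the idea, not the package's internal code: it assumes the MAF is derived from the {-1,0,1} coding as the frequency of allele A, and that the "augmented design matrix" is the fixed-effects matrix with the marker column appended. The toy matrices and the names `freq_A`, `passes_maf` and `full_rank` are hypothetical.

```r
# Toy genotype matrix: 4 lines x 3 markers, coded {-1, 0, 1} = {aa, Aa, AA}
G <- matrix(c( 1,  1, -1,
               1,  0, -1,
               1, -1,  1,
               1,  0,  1), nrow = 4, byrow = TRUE)
X <- matrix(1, nrow = 4, ncol = 1)   # intercept only
Z <- diag(4)                         # one observation per line

# Allele A count per line is g + 1 (0, 1 or 2), so its frequency is mean(g + 1) / 2
freq_A <- (colMeans(G) + 1) / 2
maf <- pmin(freq_A, 1 - freq_A)
passes_maf <- maf >= 0.01            # compare against min.MAF

# Rank check: append each marker to X and test for full column rank
full_rank <- apply(G, 2, function(g) {
  X_aug <- cbind(X, Z %*% g)
  qr(X_aug)$rank == ncol(X_aug)
})
```

In this toy example the first marker is monomorphic, so it fails both checks: its MAF is 0 and its column is collinear with the intercept.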

Value

Returns an $m \times 1$ vector of marker scores, each equal to $-\log_{10}(\text{p-value})$.
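
Because the scores are on the $-\log_{10}$ scale, they can be converted back to p-values directly. The p-values below are invented for illustration.

```r
# Hypothetical p-values and their scores
p <- c(0.05, 1e-4, 1)
scores <- -log10(p)

# Invert the transformation to recover the p-values
p_back <- 10^(-scores)
```

A zero score corresponds to a p-value of 1, so markers that were assigned a zero score by the min.MAF or check.rank filters cannot be told apart from tested markers with no evidence of association by the score alone.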

Details

This function uses the iterative, generalized least-squares approach of Kang et al. (2010). The use of a minimum MAF is typically adequate to ensure the problem is well-posed. However, if an error message indicates the problem is singular, set check.rank to "TRUE". This will slow down the algorithm but should fix the error.
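
As a rough illustration of the generalized least-squares step for a single marker, the base-R sketch below fits the augmented fixed-effects model under a known covariance matrix. It is not the package's implementation: the variance components are simply assumed (`sigma2_u`, `sigma2_e`) rather than estimated iteratively, the phenotype is random noise, and a Wald z-test stands in for whatever test the function actually applies.

```r
set.seed(42)
n <- 30                                # lines, one observation each
m <- 40                                # markers
G <- matrix(sample(c(-1, 0, 1), n * m, replace = TRUE), nrow = n)
K <- tcrossprod(G) / m                 # default kinship, K = G G' / m
Z <- diag(n)
y <- rnorm(n)                          # placeholder phenotype

# Assumed variance components; in practice these are estimated from the data
sigma2_u <- 1
sigma2_e <- 1
V <- sigma2_u * Z %*% K %*% t(Z) + sigma2_e * diag(n)
Vinv <- solve(V)

# Augmented design matrix: intercept plus the first marker
X_aug <- cbind(1, Z %*% G[, 1])

# GLS estimate: beta_hat = (X' V^-1 X)^-1 X' V^-1 y
XtVinv <- t(X_aug) %*% Vinv
W <- solve(XtVinv %*% X_aug)           # covariance of beta_hat when V is known
beta_hat <- W %*% XtVinv %*% y

# Wald z-test for the marker effect, reported on the -log10 scale
z <- beta_hat[2] / sqrt(W[2, 2])
p <- 2 * pnorm(-abs(z))
score <- -log10(p)
```

If a marker column is collinear with the other fixed effects, `solve(XtVinv %*% X_aug)` fails because the matrix is singular, which is the situation the rank check is meant to catch.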

References

Kang et al. 2010. Variance component model to account for sample structure in genome-wide association studies. Nat. Genet. 42:348-354.

Endelman, J.B. (submitted) Coupling ridge regression-BLUP and association analysis to predict complex traits.

Examples

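# Load the example genotypes and observations
data(wheat.G)
data(wheat.y)
# One observation per line, so Z is the identity matrix
n <- nrow(wheat.G)
# Score only the first 100 markers
scores <- GWA(wheat.y, G = wheat.G[, 1:100], Z = diag(n))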
