sommer (version 1.3)

bag: Creating a fixed effect matrix with significant GWAS markers

Description

This function creates a design matrix from the significant markers found in a GWAS analysis, so that it can be used in a GBLUP or genomic prediction analysis to increase prediction accuracy. The method is based on Abdollahi Arpanahi et al. (2015), where fitting the top 20 GWAS hits as fixed effects increased the prediction accuracy of the genomic prediction model; the approach is called bag GBLUP. The proposed explanation for this phenomenon is that the mixed model shrinks the effects of large-effect markers too strongly, so fitting such markers as fixed effects produces a marked increase in the prediction accuracy of the model.
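
As a rough illustration of the shrinkage argument above, the sketch below simulates one large-effect marker and compares its estimated effect under a ridge penalty with its estimate as a fixed effect. The ridge penalty is only a simple stand-in for treating markers as random effects; this is not sommer code, and the simulated data, `lambda`, and all object names are assumptions made for the example.

```r
set.seed(1)
n <- 200   # individuals (assumed)
m <- 300   # markers (assumed)

# Simulated marker matrix coded -1/0/1, then centered by column
M <- matrix(sample(c(-1, 0, 1), n * m, replace = TRUE), n, m)
M <- scale(M, center = TRUE, scale = FALSE)

# Only the first marker has a (large) effect
beta_true <- 3
y <- beta_true * M[, 1] + rnorm(n)
y <- y - mean(y)

# All markers shrunk together with a ridge penalty (arbitrary lambda)
lambda <- 300
ridge <- solve(crossprod(M) + lambda * diag(m), crossprod(M, y))

# The large marker fitted on its own as a fixed effect
fixed <- unname(coef(lm(y ~ M[, 1]))[2])

# Compare the simulated effect with both estimates
print(c(true = beta_true, ridge = ridge[1], fixed = fixed))
```

In a setup like this, the ridge estimate of the large marker is typically pulled well below its simulated value, while the fixed-effect estimate stays close to it.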

Usage

bag(gwasm, nmar=10, threshold=1, pick=FALSE, method="cluster")

Arguments

gwasm
a GWAS model fitted using mmer
nmar
the number of GWAS hits (markers) to be used for designing the incidence matrix. It finds the markers with maximum significance value and uses them to create the design matrix. The default is the top 10 markers.
threshold
a numeric value indicating the minimum significance value to be used for finding the significant markers. The default is 1.
pick
a TRUE/FALSE value indicating whether the user prefers to pick the peaks manually. The default is FALSE, leaving the peak selection to one of the two methods available. If set to TRUE, R will allow the user to pick the peaks by clicking over the peaks and typin
method
one of the two methods available: "cluster" performs peak selection by making clusters using k-means (random clusters), whereas "maximum" takes the markers with the highest log p-values and selects those for designing the model matrix.
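
The difference between the two selection strategies can be sketched in base R on simulated scores. This is not the implementation inside bag; in particular, clustering candidate marker positions with k-means and keeping the best marker per cluster is an assumed reading of the "cluster" method, and `scores`, `peaks`, and the other object names are hypothetical.

```r
set.seed(42)
n_markers <- 500

# Simulated significance scores with three artificial peaks (assumed data)
scores <- rexp(n_markers, rate = 1)
peaks <- c(100, 300, 450)
for (p in peaks) {
  idx <- (p - 5):(p + 5)
  scores[idx] <- scores[idx] + 8 - abs(idx - p)
}

nmar <- 3        # number of hits to keep
threshold <- 1   # minimum score for a marker to be considered

# Markers passing the threshold
candidates <- which(scores > threshold)

# "maximum"-style selection: the nmar highest-scoring candidates
ord <- order(scores[candidates], decreasing = TRUE)
top_max <- candidates[ord][seq_len(nmar)]

# "cluster"-style selection (assumed): k-means on marker positions,
# then the highest-scoring marker within each cluster
km <- kmeans(candidates, centers = nmar, nstart = 10)
top_cluster <- unname(vapply(
  split(candidates, km$cluster),
  function(ix) ix[which.max(scores[ix])],
  integer(1)
))

print(sort(top_max))
print(sort(top_cluster))
```

The "maximum"-style picks may sit next to each other on the same peak, whereas the per-cluster picks are forced into different regions of the marker map.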

Value

  • If all parameters are correctly indicated the program will return the design (incidence) matrix for the selected GWAS hits, which can be passed as the fixed effect matrix X to mmer.

References

Abdollahi Arpanahi R, Morota G, Valente BD, Kranis A, Rosa GJM, Gianola D. 2015. Assessment of bagging GBLUP for whole genome prediction of broiler chicken traits. Journal of Animal Breeding and Genetics 132:218-228.

Examples

####=========================================####
#### For CRAN time limitations most lines in the 
#### examples are silenced with a single '#' mark, 
#### remove them and run the examples
####=========================================####
data(CPdata)
CPpheno <- CPdata$pheno
CPgeno <- CPdata$geno

####=========================================####
#### build the response and the additive relationship matrix
####=========================================####
## only additive effects are modeled in this example
y <- CPpheno$color
Za <- diag(length(y))
A <- A.mat(CPgeno) # additive relationship matrix

####=========================================####
#### identify major genes and create the bagging matrix
####=========================================####

ETA.A <- list(list(Z=Za,K=A))
#ans.GWAS <- mmer(y=y, Z=ETA.A, W=CPgeno)
#summary(ans.GWAS)

####=========================================####
#### run the bag function to design the matrix
#### for top GWAS hits
####=========================================####

#X1 <- bag(ans.GWAS);head(X1); dim(X1)

####=========================================####
#### compare prediction accuracies between
#### GBLUP and bag GBLUP 
####=========================================####
set.seed(1234)
y.trn <- y # for prediction accuracy
ww <- sample(c(1:dim(Za)[1]),72) # delete data for one fifth of the population
y.trn[ww] <- NA

ETA.A <- list(list(Z=Za,K=A))
#ans.A <- mmer(y=y.trn, Z=ETA.A) # GBLUP
#ans.AF <- mmer(y=y.trn, X=X1, Z=ETA.A) # bagging-GBLUP
#cor(ans.A$fitted.y[ww], y[ww], use="pairwise.complete.obs") # GBLUP
#cor(ans.AF$fitted.y[ww], y[ww], use="pairwise.complete.obs") # bagging-GBLUP
#### 11 percent increase in prediction accuracy
