Functions under optimization, complementary statistics and loops written in C++ to speed up gwas, gibbs and wgr.
Some of the functions available for users include:
01) Import_data(file,type=c('GBS','HapMap','VCF')):
This function imports genotypic data in the NAM format, returning a list with a genotypic matrix gen coded as 012 and a vector chr with the count of markers per chromosome. It currently supports three file types: GBS text, HapMap and VCF.
02) markov(gen,chr):
Imputation method based on a forward Markov model for SNP data coded as 012. We recommend that users remove non-segregating markers before using this function.
03) LD(gen):
Computes the linkage disequilibrium in terms of r2 for SNP data coded as 012. Missing data is not allowed.
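As a rough illustration of r2 on 012-coded markers, the base R sketch below squares the pairwise Pearson correlation between marker columns. This is only one common way to define r2 from unphased genotypes; whether LD() uses exactly this estimator is an assumption, and gen_toy is a made-up matrix, not package data.

```r
set.seed(42)

# Toy genotype matrix (hypothetical): 50 individuals x 4 SNPs coded 0/1/2,
# with no missing values, as required by the description above.
gen_toy <- matrix(rbinom(50 * 4, size = 2, prob = 0.4), nrow = 50)

# Make SNP 2 a copy of SNP 1 so that the pair is in complete LD.
gen_toy[, 2] <- gen_toy[, 1]

# Squared Pearson correlation between every pair of marker columns.
r2 <- cor(gen_toy)^2

round(r2, 2)
```

The diagonal is 1 by construction, and the duplicated pair of markers also gets r2 = 1.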
04) PedMat(ped):
Builds a kinship matrix from a pedigree. The expected input format is shown by calling PedMat() with no arguments.
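For readers unfamiliar with pedigree-based kinship, the sketch below builds the numerator relationship matrix (A) with the classical tabular method in base R. It is not the package's code: the helper name tabular_A, the column names id, dam and sire, the use of 0 for an unknown parent, and the scaling of the result are all assumptions made for illustration.

```r
# Hypothetical helper: numerator relationship (A) matrix via the tabular method.
# Assumes ids are 1..n, parents appear before their offspring,
# and 0 marks an unknown parent.
tabular_A <- function(ped) {
  n <- nrow(ped)
  A <- matrix(0, n, n)
  for (i in seq_len(n)) {
    d <- ped$dam[i]
    s <- ped$sire[i]
    # Inbreeding coefficient: half the relationship between the two parents.
    f <- if (d > 0 && s > 0) 0.5 * A[d, s] else 0
    A[i, i] <- 1 + f
    # Relationship with each earlier individual: average over i's parents.
    for (j in seq_len(i - 1)) {
      a_d <- if (d > 0) A[j, d] else 0
      a_s <- if (s > 0) A[j, s] else 0
      A[i, j] <- A[j, i] <- 0.5 * (a_d + a_s)
    }
  }
  A
}

# Toy pedigree: 1 and 2 are founders, 3 and 4 are full sibs, 5 = 3 x 4.
ped_toy <- data.frame(id = 1:5, dam = c(0, 0, 1, 1, 3), sire = c(0, 0, 2, 2, 4))
A_toy <- tabular_A(ped_toy)
A_toy
```

In this toy pedigree the full sibs share a relationship of 0.5, and their offspring has a diagonal of 1.25, reflecting an inbreeding coefficient of 0.25.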
05) PedMat2(ped,gen=NULL,IgnoreInbr=FALSE,PureLines=FALSE):
Builds a kinship matrix from genomic data and a pedigree. Useful when not all individuals are genotyped. Row names of gen must indicate the genotype id.
06) Gdist(gen, method = 1):
Computes genetic distance among individuals. Six methods are available: 1) Nei distance; 2) Edwards distance; 3) Reynolds distance; 4) Rogers distance; 5) Prevosti's distance; 6) Modified Rogers distance.
07) covar(sp=NULL,rho=3.5,type=1,dist=2.5):
Builds a spatial kernel from field plot information. The expected input format is shown by calling covar() with no arguments. Parameter rho determines the decay of the relationship among neighbor plots. type defines whether the kernel is exponential (1), Gaussian (2) or some intermediate. dist gives the distance ratio between range neighbors and row neighbors.
08) eigX(gen,fam):
Computes the input for the argument EIG of the function gwas2.
09) G2A_Kernels(gen):
Computes a list of orthogonal kernels containing additive, dominant and first-order epistatic effects, in accordance with the G2A model from ZB Zeng et al. 2005. These kernels can be used to describe genetic architecture through variance components; for that purpose we recommend the packages varComp and BGLR.
10) NNsrc(sp=NULL,rho=1,dist=3):
Using the same field data input required by the function covar, this function provides a list of nearest neighbor plots for each entry.
11) NNcov(NN,y):
This function uses the output of NNsrc to generate a numeric vector by averaging the observed values of y over the neighbors of each entry. It is useful for generating field covariates that control micro-environmental variance without kriging.
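The neighbor-averaging idea can be sketched in base R as follows. The structure of NN_toy (a list where element i holds the indices of the neighbors of plot i) and the helper name nn_average are assumptions for illustration; the actual output of NNsrc may be organized differently.

```r
# Hypothetical helper: average the phenotype over each plot's neighbors.
nn_average <- function(NN, y) {
  vapply(NN, function(idx) mean(y[idx], na.rm = TRUE), numeric(1))
}

# Toy data: four plots in a single row, each neighboring the adjacent plots.
y_toy  <- c(10, 12, 11, 15)
NN_toy <- list(2, c(1, 3), c(2, 4), 3)

# One covariate value per plot, built only from neighboring observations.
cov_toy <- nn_average(NN_toy, y_toy)
cov_toy
```

Because each covariate excludes the plot's own observation, it can enter a model as a local environmental index without fitting a full spatial covariance structure.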
11) emXX(y,gen,...):
Fits whole-genome regressions using the expectation-maximization algorithm as opposed to MCMC. Currently available methods include BayesA (emBA), BayesB (emBB), BayesC (emBC), BLASSO (emBL), BLASSO2 (emDE), Elastic-Net (emEN), and Ridge regression (emRR). The L2 variable-selection methods, emBB and emBC, are not stable.
12) CNT(X):
Centers the parameters (columns) of matrix X. Useful to convert allele coding from 012 or -101 into G2A coding. This function does not return anything; rather, it modifies X directly.
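A minimal base R sketch of column centering is shown below, assuming that centering means subtracting each column's mean. Unlike CNT, which is described as modifying X in place, this hypothetical helper center_cols returns a new matrix, since plain R functions work on copies.

```r
# Hypothetical helper: subtract the column mean from every column.
center_cols <- function(X) sweep(X, 2, colMeans(X), "-")

# Toy 012-coded marker matrix: 4 individuals x 2 SNPs.
X_toy <- matrix(c(0, 1, 2, 2,
                  2, 2, 0, 1), ncol = 2)

X_c <- center_cols(X_toy)
X_c
colMeans(X_c)  # numerically zero for every column
```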
13) IMP(X):
Imputes missing points in matrix X with the average value of the column. This function does not return anything; rather, it modifies X directly.
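Column-mean imputation can be sketched in base R as below. The helper impute_col_mean is hypothetical and returns a new matrix instead of modifying X in place as IMP is described to do.

```r
# Hypothetical helper: replace each NA with the mean of its column.
impute_col_mean <- function(X) {
  for (j in seq_len(ncol(X))) {
    miss <- is.na(X[, j])
    X[miss, j] <- mean(X[, j], na.rm = TRUE)
  }
  X
}

# Toy marker matrix with one missing call per column.
X_toy <- matrix(c(0, 2, NA, 1,
                  1, NA, 1, 2), ncol = 2)

X_imp <- impute_col_mean(X_toy)
X_imp
```

Note that imputed values are generally not integers (here 1 and 4/3), so the result is no longer a strict 012 coding.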
14) GAU(X):
Creates a Gaussian kernel from matrix X.
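For orientation, one common form of the Gaussian kernel is K = exp(-d^2 / h), where d is the Euclidean distance between rows of X. The base R sketch below follows that form; the helper name gaussian_kernel and the default bandwidth h (the mean squared distance between distinct rows) are assumptions, and GAU may scale the distances differently.

```r
# Hypothetical helper: Gaussian kernel among the rows of X.
gaussian_kernel <- function(X, h = NULL) {
  D2 <- as.matrix(dist(X))^2                      # squared Euclidean distances
  if (is.null(h)) h <- mean(D2[upper.tri(D2)])    # assumed default bandwidth
  exp(-D2 / h)
}

set.seed(1)
# Toy marker matrix: 6 individuals x 20 SNPs coded 0/1/2.
X_toy <- matrix(rbinom(6 * 20, size = 2, prob = 0.5), nrow = 6)

K <- gaussian_kernel(X_toy)
round(K, 2)
```

The resulting matrix is symmetric with ones on the diagonal, and off-diagonal values shrink toward zero as individuals become more distant.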
15) MSX(X):
Computes the cross-product of each column of X and the sum of the variances of the columns of X.
16) NOR(y,X,cxx,xx,maxit=50,tol=10e-6):
Solves a ridge regression using GSRU (Gauss-Seidel with residual update), where y corresponds to the response variable, X is the set of parameters, cxx and xx are the output from the MSX function, and maxit and tol are the convergence criteria.
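To make the GSRU idea concrete, the sketch below solves a ridge regression one coefficient at a time while keeping the residual vector up to date. It is a simplified illustration, not the NOR implementation: the helper gsru_ridge takes a fixed, user-supplied shrinkage parameter lambda (an assumption; NOR takes cxx and xx from MSX instead), and the convergence rule on squared coefficient changes is also assumed.

```r
# Hypothetical helper: ridge regression via Gauss-Seidel with residual update.
gsru_ridge <- function(y, X, lambda, maxit = 50, tol = 1e-5) {
  xx <- colSums(X^2)      # cross-product of each column with itself
  mu <- mean(y)           # intercept (X is assumed to be column-centered)
  b  <- rep(0, ncol(X))
  e  <- y - mu            # residuals given the current coefficients
  for (it in seq_len(maxit)) {
    b_old <- b
    for (j in seq_len(ncol(X))) {
      # Update coefficient j using the residuals with its own effect added back.
      b_new <- (sum(X[, j] * e) + xx[j] * b[j]) / (xx[j] + lambda)
      # Residual update: remove only the change in the effect of column j.
      e <- e - X[, j] * (b_new - b[j])
      b[j] <- b_new
    }
    if (sum((b - b_old)^2) < tol) break
  }
  list(mu = mu, b = b, iterations = it)
}

set.seed(1)
X_toy <- scale(matrix(rnorm(200 * 5), 200, 5), scale = FALSE)  # centered columns
y_toy <- as.numeric(X_toy %*% c(1, -1, 0.5, 0, 2) + rnorm(200))

fit <- gsru_ridge(y_toy, X_toy, lambda = 10, maxit = 200, tol = 1e-12)

# Closed-form ridge solution for comparison.
exact <- solve(crossprod(X_toy) + 10 * diag(5), crossprod(X_toy, y_toy - mean(y_toy)))
cbind(gsru = fit$b, exact = as.numeric(exact))
```

The appeal of the residual update is that each coefficient update costs one pass over a single column, so the full cross-product matrix of X is never formed or inverted.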
17) SPC(y,blk,row,col,rN=3,cN=1):
Computes a spatial covariate, similar to what could be obtained using NNsrc and NNcov, but in a single step. It is often faster than NNsrc/NNcov.
# NOT RUN {
# Forward gen imputation
data(tpod)
fast.impute = markov(gen,chr)
# A matrix
PedMat()
# Pairwise LD
ld = LD(gen[,1:10])
heatmap(ld)
# Spatial correlation (kernel-based)
covar()
# Spatial correlation (NN-based)
NNsrc()
# Genetic distance
round(Gdist(gen[1:10,],method=1),2)
# PCs of a NAM kinship
eG = eigX(gen,fam)
plot(eG[[2]],col=fam)
# Polygenic kinship matrices
Ks = G2A_Kernels(gen)
ls(Ks)
# Genomic regression fitted via EM
h = emBA(y,gen)
plot(h$b,pch=20)
# }