linearRegression(phen, gen=NULL, genZ=NULL,
    reference="noia", max.level=NULL, max.dom=NULL, fast=FALSE)
multilinearRegression(phen, gen=NULL, genZ=NULL,
    reference="noia", max.level=NULL, max.dom=NULL, fast=FALSE,
    e.unique=FALSE, start.algo="linear", start.values=NULL,
    robust=FALSE, bilinear.steps=1, ...)
prepareRegression(phen, gen=NULL, genZ=NULL,
    reference="noia", max.level=NULL, max.dom=NULL, fast=FALSE)
gen: [...] genNames for the genotype encoding. Not necessary if genZ is provided.
genZ: [...] gen is provided.
reference: [...] "noia" reference point is used, since it provides a fairly good orthogonality. Other possibilities are "G2A", "F2", "F1", [...]
max.dom: [...] max.level. In the multilinear regression, the maximum level for dominance effects cannot be > 1.
start.algo: [...] "linear", "multilinear", "subset" or "bilinear". Ignored if start.values are provided.
bilinear.steps: [...] bilinearStep function. Ignored if start.algo is not "bilinear". If NULL, the bilinear algorithm is run until (almost) convergence.
...: [...] nls, including nls.control.
linearRegression and multilinearRegression return an object of class "noia.linear" or "noia.multilinear", both having their own print methods: print.noia.linear and print.noia.multilinear.
If a gen data set is provided, it will be turned into a genZ through the gen2genZ function. Missing data (unknown genotypes) are considered as loci for which genotypic probabilities are identical to the genotypic frequencies in the population.
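The missing-data rule can be illustrated with a small base-R sketch for a single locus. This is not the code of gen2genZ: the helper name locus_probs_sketch and the genotype coding 1, 2, 3 (with NA for unknown) are assumptions made for the example only.

```r
# Hypothetical helper, NOT part of noia: builds a matrix of genotypic
# probabilities for one locus. Assumed coding: 1, 2, 3 = the three genotypes,
# NA = unknown genotype.
locus_probs_sketch <- function(g, codes = c(1, 2, 3)) {
  known <- !is.na(g)
  # Genotypic frequencies estimated from individuals with known genotypes
  freq <- as.numeric(table(factor(g[known], levels = codes))) / sum(known)
  # One row per individual, one column per genotype
  Z <- matrix(0, nrow = length(g), ncol = length(codes))
  # Known genotypes get probability 1 in their own column
  Z[cbind(which(known), match(g[known], codes))] <- 1
  # Unknown genotypes get the population frequencies (filled column by column)
  Z[!known, ] <- rep(freq, each = sum(!known))
  Z
}

g <- c(1, 2, 2, 3, NA, 2, 1, NA)
Z <- locus_probs_sketch(g)
Z
```

Each row of Z sums to 1; the rows for the two NA individuals carry the frequencies observed among the six genotyped individuals.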
The algebraic framework is described extensively in Alvarez-Castro & Carlborg 2007. The default reference point ("noia") provides an orthogonal decomposition of genetic effects in the 1-locus case, whatever the genotypic frequencies. It remains a good approximation of orthogonality in the multi-locus case if linkage disequilibrium is small. Other optional reference points are those of the "G2A" model (Zeng et al. 2005), and the unweighted regression model "UWR" (Cheverud & Routman 1995). Several key populations can be taken as reference as well: "F2", "F1", "Finf" (F infinity), and the two "parental" homozygous populations "P1" and "P2".
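The one-locus orthogonality property can be checked numerically. The sketch below builds mean, additive and dominance columns for arbitrary genotype frequencies, following the one-locus statistical construction as we understand it from Alvarez-Castro & Carlborg (2007); the exact scaling of the columns is an assumption, and the variable names (p, a, d, S) are chosen for the example and do not come from noia.

```r
# Arbitrary (non Hardy-Weinberg) frequencies of genotypes 11, 12, 22
p <- c(0.5, 0.3, 0.2)

# Additive column: count of the "2" allele, centred on its population mean
N <- c(0, 1, 2)
a <- N - sum(p * N)

# Dominance column (assumed scaling, see lead-in)
V <- p[1] + p[3] - (p[1] - p[3])^2
d <- c(-2 * p[2] * p[3], 4 * p[1] * p[3], -2 * p[1] * p[2]) / V

# Design matrix for the three genotypes
S <- cbind(mu = 1, a = a, d = d)

# Frequency-weighted cross-products: off-diagonal terms measure
# the lack of orthogonality between columns
G <- t(S) %*% diag(p) %*% S
off_diag <- G[upper.tri(G)]
off_diag
```

All off-diagonal terms vanish up to rounding error, and changing p to any other set of frequencies summing to 1 (with V different from 0) leaves them at zero.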
The multilinear model for genetic interactions is an alternative way to model epistatic interactions between at least two loci (see Hansen & Wagner 2001). The computation of multilinear estimates requires a non-linear regression step that relies on the nls function. Good starting values are key to ensuring that the non-linear regression converges, and several algorithms are provided, selected through the start.algo option. "linear" performs a linear regression and approximates the genetic effects from it, while "multilinear" performs a simpler multilinear regression (without dominance) to initialize the genetic effects. "subset" estimates all genetic effects from a random subset (50%) of the population, and "bilinear" alternately estimates marginal and epistatic effects. See startingValues for more information.
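The role of starting values can be sketched with base R only. The example below is a deliberately simplified two-locus multilinear model without dominance, phen = mu + a1*x1 + a2*x2 + e*(a1*x1)*(a2*x2), fitted with nls from starting values derived from a linear regression, in the spirit of the "linear" option. It is not the code used by multilinearRegression or startingValues; the simulated data, the parameter names (mu, a1, a2, e) and the way the start is derived are assumptions made for illustration.

```r
set.seed(1)
n  <- 1000
# Allele counts (0, 1, 2) at two loci, F2-like frequencies
x1 <- sample(0:2, n, replace = TRUE, prob = c(0.25, 0.5, 0.25))
x2 <- sample(0:2, n, replace = TRUE, prob = c(0.25, 0.5, 0.25))

# Simulated phenotypes under the simplified multilinear model
a1 <- 0.5; a2 <- 0.8; e <- 0.6
phen <- 1 + a1 * x1 + a2 * x2 + e * (a1 * x1) * (a2 * x2) + rnorm(n, sd = 0.2)
dat <- data.frame(phen, x1, x2)

# Step 1: linear regression with an interaction term
fit_lm <- lm(phen ~ x1 * x2, data = dat)
b <- coef(fit_lm)

# Step 2: approximate the multilinear parameters from the linear estimates
start <- list(mu = b[["(Intercept)"]],
              a1 = b[["x1"]],
              a2 = b[["x2"]],
              e  = b[["x1:x2"]] / (b[["x1"]] * b[["x2"]]))

# Step 3: non-linear regression started from these values
fit_nls <- nls(phen ~ mu + a1 * x1 + a2 * x2 + e * a1 * a2 * x1 * x2,
               data = dat, start = start)
coef(fit_nls)
```

In this two-locus, additive-only case the multilinear model is a reparameterization of the linear model with interaction, so the linear start is already at the optimum. With more loci or with dominance the linear estimates only approximate the multilinear ones, which is where the choice of starting algorithm matters.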
prepareRegression performs all preliminary calculations on the dataset but does not run any regression. It is called internally by both linearRegression and multilinearRegression.
See also: geneticEffects, GPmap, varianceDecomposition.
set.seed(123456789)
map <- c(0.25, -0.75, -0.75, -0.75, 2.25, 2.25, -0.75, 2.25, 2.25)
pop <- simulatePop(map, N=500, sigmaE=0.2, type="F2")
# Regressions
linear <- linearRegression(phen=pop$phen, gen=cbind(pop$Loc1, pop$Loc2))
multilinear <- multilinearRegression(phen=pop$phen,
    gen=cbind(pop$Loc1, pop$Loc2))
# Linear effects, associated variances and stderr
linear
# Multilinear effects
multilinear