scantwo: Two-dimensional genome scan with a two-QTL model

Description

Perform a two-dimensional genome scan with a two-QTL model, with possible allowance for covariates. Warning: two-dimensional genome scans on a dense grid can take a great deal of computer time and memory.

Usage

scantwo(cross, chr, pheno.col=1, method=c("em","imp","hk","mr"), 
        addcov=NULL, intcov=NULL, run.scanone=TRUE,
        incl.markers=FALSE, maxit=4000, tol=1e-4,
        trace=TRUE, n.perm)

Arguments

cross

An object of class cross. See read.cross for details.

chr

Vector indicating the chromosomes for which LOD scores should be calculated.

pheno.col

Column number in the phenotype matrix which should be used as the phenotype.

method

Indicates whether to use standard interval mapping (i.e., the EM algorithm), imputation, Haley-Knott regression, or marker regression.

addcov

Additive covariates.

intcov

Interactive covariates (interact with QTL genotype).

run.scanone

If TRUE, run the function scanone and place the results on the diagonal.

incl.markers

If FALSE, do calculations only at points on an evenly spaced grid.

maxit

Maximum number of iterations in the EM algorithm; used only with method "em".

tol

Tolerance value for determining convergence in the EM algorithm; used only with method "em".

trace

If TRUE, display information about the progress of calculations. For method "em", if trace is an integer above 1, further details on the progress of the algorithm will be displayed.

n.perm

If specified, a permutation test is performed rather than an analysis of the observed data. This argument defines the number of permutation replicates.

Value

If n.perm is missing, the function returns a list of class "scantwo" with two components. The first component is a matrix of dimension [tot.pos x tot.pos] whose upper triangle contains the epistasis LOD scores and whose lower triangle contains the joint LOD scores. If run.scanone=TRUE, the diagonal contains the results of scanone. The second component is a data.frame giving the locations at which the two-QTL LOD scores were calculated. Its first column is the chromosome identifier, its second column is the position in cM, and its third column is a 1/0 indicator that makes it easy to pull out only the equally spaced positions later.

If n.perm is specified, the function returns a matrix with two columns, containing the maximum joint and epistasis LOD scores across a two-dimensional scan for each of the permutation replicates.
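
The layout of the first component can be illustrated with a small base R sketch. The matrix and data frame below are made-up stand-ins, not output from scantwo, and the column names of the data frame (chr, pos, on.grid) are assumptions chosen for illustration only.

```r
# Toy stand-in for the LOD matrix (values are invented):
# upper triangle = epistasis LOD, lower triangle = joint LOD.
lod <- matrix(c(0.0, 1.2, 0.4,
                5.1, 0.0, 0.9,
                3.3, 6.7, 0.0),
              nrow = 3, byrow = TRUE)

# Toy stand-in for the location data frame (hypothetical column names).
pos <- data.frame(chr = c("1", "1", "2"),
                  pos = c(0, 10, 0),
                  on.grid = c(1, 1, 1))

joint     <- lod[lower.tri(lod)]   # joint LOD scores
epistasis <- lod[upper.tri(lod)]   # epistasis LOD scores

# Pair of positions with the largest joint LOD score.
best <- which(lod == max(joint) & lower.tri(lod), arr.ind = TRUE)
pos[best[1, ], ]
```

Restricting the comparison to lower.tri(lod) keeps a large epistasis or diagonal value from being mistaken for the joint maximum.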

Details

The interval mapping (i.e. EM algorithm) and Haley-Knott regression methods require that multipoint genotype probabilities are first calculated using calc.genoprob. The imputation method uses the results of sim.geno.

The method em is standard interval mapping by the EM algorithm (Dempster et al. 1977; Lander and Botstein 1989). Marker regression is simply linear regression of phenotypes on marker genotypes (individuals with missing genotypes are discarded). Haley-Knott regression uses the regression of phenotypes on multipoint genotype probabilities. The imputation method uses the pseudomarker algorithm described by Sen and Churchill (2001). Individuals with missing phenotypes are dropped.

In the presence of covariates, the full model is $$y = \mu + \beta_{q1} + \beta_{q2} + \beta_{q1 \times q2} + A \gamma + Z \delta_{q1} + Z \delta_{q2} + Z \delta_{q1 \times q2} + \epsilon$$ where q1 and q2 are the unknown QTL genotypes at two locations, A is a matrix of covariates, and Z is a matrix of covariates that interact with QTL genotypes. The columns of Z are forced to be contained in the matrix A.

We calculate LOD scores comparing the full model to each of two alternatives. The joint LOD score compares the full model to the following null model: $$y = \mu + A \gamma + \epsilon$$ The epistasis LOD score compares the full model to the following additive model: $$y = \mu + \beta_{q1} + \beta_{q2} + A \gamma + Z \delta_{q1} + Z \delta_{q2} + \epsilon$$
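
The three models can be sketched in base R with lm. This is an illustration only, not the code scantwo runs: it assumes simulated, fully observed 0/1 genotypes (so no EM, imputation, or genotype probabilities are needed), a covariate sex that is in both A and Z, a covariate age that is in A only, and the usual normal-model relation LOD = (n/2) log10(RSS_reduced / RSS_full).

```r
set.seed(1)
n   <- 200
q1  <- rbinom(n, 1, 0.5)   # simulated genotype at the first locus (0/1)
q2  <- rbinom(n, 1, 0.5)   # simulated genotype at the second locus (0/1)
sex <- rbinom(n, 1, 0.5)   # covariate in both A and Z
age <- rnorm(n)            # covariate in A only
y   <- 1 + 0.5 * q1 + 0.5 * q2 + 0.8 * q1 * q2 + 0.3 * sex + rnorm(n)

rss <- function(fit) sum(residuals(fit)^2)
lod <- function(reduced, full) (n / 2) * log10(rss(reduced) / rss(full))

null.fit <- lm(y ~ sex + age)                              # null model
add.fit  <- lm(y ~ q1 + q2 + sex + age + q1:sex + q2:sex)  # additive model
full.fit <- lm(y ~ q1 * q2 * sex + age)                    # full model

joint.lod     <- lod(null.fit, full.fit)
epistasis.lod <- lod(add.fit, full.fit)
c(joint = joint.lod, epistasis = epistasis.lod)
```

Because the null model is nested in the additive model, the joint LOD score is never smaller than the epistasis LOD score; their difference is the LOD score of the additive model against the null.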

In the case that n.perm is specified, the R function scantwo is called repeatedly.
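
The idea of a permutation test for a two-dimensional scan can be sketched in base R as follows. This is a simplified illustration in the spirit of Churchill and Doerge (1994), not the scantwo implementation: the helper pair.lods is hypothetical, the genotypes are simulated and fully observed, there are no covariates, and the phenotype is shuffled relative to the genotypes in each replicate.

```r
set.seed(42)
n    <- 100
geno <- matrix(rbinom(n * 4, 1, 0.5), nrow = n)  # 4 simulated loci, coded 0/1
y    <- rnorm(n)                                  # phenotype with no QTL effect

# Hypothetical helper: joint and epistasis LOD for every pair of loci.
pair.lods <- function(y, geno) {
  n.ind <- length(y)
  rss0  <- sum((y - mean(y))^2)                   # null model: mean only
  pairs <- combn(ncol(geno), 2)
  t(apply(pairs, 2, function(p) {
    a <- geno[, p[1]]
    b <- geno[, p[2]]
    rss.add  <- sum(residuals(lm(y ~ a + b))^2)   # additive model
    rss.full <- sum(residuals(lm(y ~ a * b))^2)   # full model
    c(joint     = (n.ind / 2) * log10(rss0 / rss.full),
      epistasis = (n.ind / 2) * log10(rss.add / rss.full))
  }))
}

# For each replicate, shuffle the phenotype and keep the two maxima.
n.perm   <- 20
perm.max <- t(replicate(n.perm, apply(pair.lods(sample(y), geno), 2, max)))

# Approximate 5% genome-wide thresholds from the permutation distribution.
apply(perm.max, 2, quantile, 0.95)
```

The result has the same shape as the matrix described under Value: one row per permutation replicate and one column each for the maximum joint and epistasis LOD scores.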

References

Churchill, G. A. and Doerge, R. W. (1994) Empirical threshold values for quantitative trait mapping. Genetics 138, 963--971.

Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977) Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. B, 39, 1--38.

Haley, C. S. and Knott, S. A. (1992) A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69, 315--324.

Lander, E. S. and Botstein, D. (1989) Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121, 185--199.

Sen, S. and Churchill, G. A. (2001) A statistical framework for quantitative trait mapping. Genetics 159, 371--387.

Soller, M., Brody, T. and Genizi, A. (1976) On the power of experimental designs for the detection of linkage between marker loci and quantitative loci in crosses between inbred lines. Theor. Appl. Genet. 47, 35--39.

Examples


data(fake.f2)
fake.f2 <- calc.genoprob(fake.f2, step=10)
out.2dim <- scantwo(fake.f2, method="hk")
plot(out.2dim)

# permutation test; 1000 replicates can take a long time,
# so use a small n.perm (e.g. n.perm=3) for a quick check
permo.2dim <- scantwo(fake.f2, method="hk", n.perm=1000)
apply(permo.2dim, 2, quantile, 0.95)

# covariates
data(fake.bc)
fake.bc <- calc.genoprob(fake.bc, step=10)
ac <- fake.bc$pheno[,c("sex","age")]
ic <- fake.bc$pheno[,"sex"]
out <- scantwo(fake.bc, method="hk", pheno.col=1,
               addcov=ac, intcov=ic)
plot(out)
