scantwo: Two-dimensional genome scan with a two-QTL model

Description

Perform a two-dimensional genome scan with a two-QTL model, with possible allowance for covariates. Warning: two-dimensional genome scans on a dense grid can take a great deal of computer time and memory.

Usage

scantwo(cross, chr, pheno.col=1,
        method=c("em","imp","hk","mr","mr-imp","mr-argmax"),
        model=c("normal","binary"),
        addcovar=NULL, intcovar=NULL, weights=NULL,
        run.scanone=TRUE, incl.markers=FALSE, maxit=4000, tol=1e-4,
        verbose=TRUE, n.perm)

Arguments

cross

An object of class cross. See read.cross for details.

chr

Vector indicating the chromosomes for which LOD scores should be calculated.

pheno.col

Column number in the phenotype matrix which should be used as the phenotype.

method

Indicates whether to use standard interval mapping (i.e. the EM algorithm), imputation, Haley-Knott regression, or marker regression. Marker regression is performed either by dropping individuals with missing genotypes ("mr"), or by f

model

The phenotypic model: the usual normal model or a model for binary traits.

addcovar

Additive covariates.

intcovar

Interactive covariates (interact with QTL genotype).

weights

Optional weights of individuals. Should be either NULL or a vector of length n.ind containing positive weights.

run.scanone

If TRUE, run the function scanone and place the results on the diagonal.

incl.markers

If FALSE, do calculations only at points on an evenly spaced grid.

maxit

Maximum number of iterations in the EM algorithm; used only with method "em".

tol

Tolerance value for determining convergence in the EM algorithm; used only with method "em".

verbose

If TRUE, display information about the progress of calculations. For method "em", if verbose is an integer above 1, further details on the progress of the algorithm will be displayed.

n.perm

If specified, a permutation test is performed rather than an analysis of the observed data. This argument defines the number of permutation replicates.

Value

If n.perm is missing, the function returns a list with class "scantwo" and containing three components. The first component is a matrix of dimension [tot.pos x tot.pos] whose upper triangle contains the epistasis LOD scores and whose lower triangle contains the joint LOD scores. If run.scanone=TRUE, the diagonal contains the results of scanone. The second component of the output is a data.frame indicating the locations at which the two-QTL LOD scores were calculated. The first column is the chromosome identifier, the second column is the position in cM, the third column is a 1/0 indicator for ease in later pulling out only the equally spaced positions, and the fourth column indicates whether the position is on the X chromosome or not. The final component is a version of the results of scanone including sex and/or cross direction as additive covariates, which is needed for a proper calculation of conditional LOD scores.

If n.perm is specified, the function returns a matrix with two columns, containing the maximum joint and epistasis LOD scores, across a two-dimensional scan, for each of the permutation replicates.
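
The triangular layout of the LOD matrix described above can be illustrated with base R alone. This is only a sketch: the 3 x 3 matrix and its values are made up, and the names lod, epistasis, joint and single are hypothetical. It is not output from scantwo.

```r
# Hypothetical 3 x 3 LOD matrix laid out as described above (values are made up):
# upper triangle = epistasis LOD, lower triangle = joint LOD,
# diagonal = single-QTL (scanone) LOD.
lod <- matrix(c(2.1, 0.4, 0.9,
                5.3, 1.8, 0.2,
                6.0, 4.7, 3.0), nrow = 3, byrow = TRUE)

# upper.tri()/lower.tri() return logical masks; values come out in column-major order
epistasis <- lod[upper.tri(lod)]   # positions (1,2), (1,3), (2,3)
joint     <- lod[lower.tri(lod)]   # positions (2,1), (3,1), (3,2)
single    <- diag(lod)             # positions (1,1), (2,2), (3,3)

# row/column indices of the pair of positions with the largest joint LOD
best <- which(lod == max(joint) & lower.tri(lod), arr.ind = TRUE)
```

Working with logical masks keeps the two kinds of LOD score apart without copying the matrix.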

X chromosome

The X chromosome must be treated specially in QTL mapping.

As in scanone, if both males and females are included, male hemizygotes are allowed to be different from female homozygotes, and the null hypothesis must be changed in order to ensure that sex- or pgm-differences in the phenotype do not result in spurious linkage to the X chromosome.

Details

The interval mapping (i.e. EM algorithm) and Haley-Knott regression methods require that multipoint genotype probabilities are first calculated using calc.genoprob. The imputation method uses the results of sim.geno.
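
As a minimal sketch of these prerequisites, the following prepares one cross for both kinds of method. It assumes the qtl package is installed and uses the fake.f2 data set from the Examples; the step size and the small n.draws value are arbitrary choices made only to keep the run short, and out.hk and out.imp are illustrative names.

```r
library(qtl)

data(fake.f2)
fake.f2 <- subset(fake.f2, chr=18:19)   # keep two chromosomes to limit run time

# "em" and "hk" need multipoint genotype probabilities
fake.f2 <- calc.genoprob(fake.f2, step=10)
out.hk <- scantwo(fake.f2, method="hk", verbose=FALSE)

# "imp" needs imputed genotypes; n.draws=8 is deliberately small here
fake.f2 <- sim.geno(fake.f2, step=10, n.draws=8)
out.imp <- scantwo(fake.f2, method="imp", verbose=FALSE)
```

Both preparation steps store their results in the cross object, so the same object can then be passed to scantwo with either method.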

The method em is standard interval mapping by the EM algorithm (Dempster et al. 1977; Lander and Botstein 1989). Marker regression is simply linear regression of phenotypes on marker genotypes (individuals with missing genotypes are discarded). Haley-Knott regression uses the regression of phenotypes on multipoint genotype probabilities. The imputation method uses the pseudomarker algorithm described by Sen and Churchill (2001). Individuals with missing phenotypes are dropped.

In the presence of covariates, the full model is $$y = \mu + \beta_{q1} + \beta_{q2} + \beta_{q1 \times q2} + A \gamma + Z \delta_{q1} + Z \delta_{q2} + Z \delta_{q1 \times q2} + \epsilon$$ where q1 and q2 are the unknown QTL genotypes at two locations, A is a matrix of covariates, and Z is a matrix of covariates that interact with QTL genotypes. The columns of Z are forced to be contained in the matrix A.

We calculate LOD scores comparing the full model to each of two alternatives. The joint LOD score compares the full model to the following null model: $$y = \mu + A \gamma + \epsilon$$ The epistasis LOD score compares the full model to the following additive model: $$y = \mu + \beta_{q1} + \beta_{q2} + A \gamma + Z \delta_{q1} + Z \delta_{q2} + \epsilon$$
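
The two comparisons can be sketched in base R for the simplified case where the QTL genotypes are fully observed, the phenotype is normal, and there are no interactive covariates. In that case each LOD score reduces to (n/2) log10 of a ratio of residual sums of squares. The data are simulated, and the names q1, q2, sex, rss, lod.joint and lod.epi are hypothetical. This illustrates the model comparisons only; it is not how scantwo handles unknown genotypes between markers.

```r
set.seed(1)
n <- 200

# simulated backcross-like genotypes at two loci and one additive covariate
q1  <- factor(sample(c("AA", "AB"), n, replace = TRUE))
q2  <- factor(sample(c("AA", "AB"), n, replace = TRUE))
sex <- rbinom(n, 1, 0.5)
y   <- 1 + 0.5 * (q1 == "AB") + 0.5 * (q2 == "AB") + 0.3 * sex + rnorm(n)

# residual sum of squares of a fitted linear model
rss <- function(fit) sum(residuals(fit)^2)

fit.null <- lm(y ~ sex)             # y = mu + A gamma + e
fit.add  <- lm(y ~ q1 + q2 + sex)   # additive two-QTL model
fit.full <- lm(y ~ q1 * q2 + sex)   # full model with the q1 x q2 interaction

# LOD = log10 likelihood ratio = (n/2) * log10(RSS_reduced / RSS_full)
lod.joint <- (n / 2) * log10(rss(fit.null) / rss(fit.full))
lod.epi   <- (n / 2) * log10(rss(fit.add)  / rss(fit.full))
```

Because the null model is nested in the additive model, which is nested in the full model, the joint LOD score is never smaller than the epistasis LOD score.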

When n.perm is specified, the R function scantwo is called repeatedly, once for each permutation replicate.
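
The general idea of a permutation test for a genome-wide threshold (Churchill and Doerge 1994) can be sketched in base R. This is an assumption-laden toy version, not the internals of scantwo: the genotypes are made up, only four fully observed markers are scanned, and the helper names joint.lod and max.joint are hypothetical.

```r
set.seed(42)
n <- 100

# made-up 0/1 genotypes at 4 markers and a phenotype with no true QTL
geno <- matrix(rbinom(n * 4, 1, 0.5), nrow = n)
y <- rnorm(n)

# joint LOD for one pair of markers: full two-locus model vs. the null model
joint.lod <- function(y, g1, g2) {
  rss0 <- sum(residuals(lm(y ~ 1))^2)
  rss1 <- sum(residuals(lm(y ~ factor(g1) * factor(g2)))^2)
  (length(y) / 2) * log10(rss0 / rss1)
}

# maximum joint LOD across all marker pairs (a tiny two-dimensional scan)
max.joint <- function(y, geno) {
  pairs <- combn(ncol(geno), 2)
  max(apply(pairs, 2, function(p) joint.lod(y, geno[, p[1]], geno[, p[2]])))
}

# shuffle phenotypes relative to genotypes and record the maximum each time
n.perm <- 50
perm.max <- replicate(n.perm, max.joint(sample(y), geno))

# the 95th percentile of the maxima serves as a 5% genome-wide threshold
threshold <- quantile(perm.max, 0.95)
```

Each replicate repeats the whole scan, which is why permutation tests with a two-dimensional scan are slow on dense grids.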

References

Churchill, G. A. and Doerge, R. W. (1994) Empirical threshold values for quantitative trait mapping. Genetics 138, 963--971.

Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977) Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. B, 39, 1--38.

Haley, C. S. and Knott, S. A. (1992) A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69, 315--324.

Lander, E. S. and Botstein, D. (1989) Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121, 185--199.

Sen, S. and Churchill, G. A. (2001) A statistical framework for quantitative trait mapping. Genetics 159, 371--387.

Soller, M., Brody, T. and Genizi, A. (1976) On the power of experimental designs for the detection of linkage between marker loci and quantitative loci in crosses between inbred lines. Theor. Appl. Genet. 47, 35--39.

Examples

data(fake.f2)
fake.f2 <- subset(fake.f2, chr=18:19)
fake.f2 <- calc.genoprob(fake.f2, step=10)
out.2dim <- scantwo(fake.f2, method="hk")
plot(out.2dim)

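# quick permutation test with only 2 replicates
permo.2dim <- scantwo(fake.f2, method="hk", n.perm=2)
# a more realistic run with 1000 replicates (much slower)
permo.2dim <- scantwo(fake.f2, method="hk", n.perm=1000)
# 95th percentile of each column of the permutation results
apply(permo.2dim,2,quantile,0.95)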

# covariates
data(fake.bc)
fake.bc <- subset(fake.bc, chr=16:17)
fake.bc <- calc.genoprob(fake.bc, step=10)
ac <- fake.bc$pheno[,c("sex","age")]
ic <- fake.bc$pheno[,"sex"]
out <- scantwo(fake.bc, method="hk", pheno.col=1,
               addcovar=ac, intcovar=ic)
plot(out)