Learn R Programming

qtl (version 1.66)

sim.cross: Simulate a QTL experiment

Description

Simulates data for a QTL experiment using a model in which QTLs act additively.

Usage

sim.cross(map, model=NULL, n.ind=100,
         type=c("f2", "bc", "4way", "risib", "riself",
           "ri4sib", "ri4self", "ri8sib", "ri8self", "bcsft"),
          error.prob=0, missing.prob=0, partial.missing.prob=0,
          keep.qtlgeno=TRUE, keep.errorind=TRUE, m=0, p=0,
      map.function=c("haldane","kosambi","c-f","morgan"),
          founderGeno, random.cross=TRUE, ...)

Value

An object of class cross. See read.cross for details.

If keep.qtlgeno is TRUE, the cross object will contain a component qtlgeno which is a matrix containing the QTL genotypes (with complete data and no errors), coded as in the genotype data.

If keep.errorind is TRUE and errors were simulated, each component of geno will each contain a matrix errors, with 1's indicating simulated genotyping errors.

Arguments

map

A list whose components are vectors containing the marker locations on each of the chromosomes.

model

A matrix where each row corresponds to a different QTL, and gives the chromosome number, cM position and effects of the QTL.

n.ind

Number of individuals to simulate.

type

Indicates whether to simulate an intercross (f2), a backcross (bc), a phase-known 4-way cross (4way), or recombinant inbred lines (by selfing or by sib-mating, and with the usual 2 founder strains or with 4 or 8 founder strains).

error.prob

The genotyping error rate.

missing.prob

The rate of missing genotypes.

partial.missing.prob

When simulating an intercross or 4-way cross, this gives the rate at which markers will be incompletely informative (i.e., dominant or recessive).

keep.qtlgeno

If TRUE, genotypes for the simulated QTLs will be included in the output.

keep.errorind

If TRUE, and if error.prob > 0, the identity of genotyping errors will be included in the output.

m

Interference parameter; a non-negative integer. 0 corresponds to no interference.

p

Probability that a chiasma comes from the no-interference mechanism

map.function

Indicates whether to use the Haldane, Kosambi, Carter-Falconer, or Morgan map function when converting genetic distances into recombination fractions.

founderGeno

For 4- or 8-way RIL, the genotype data of the founder strains, as a list whose components are numeric matrices (no. markers x no. founders), one for each chromosome.

random.cross

For 4- or 8-way RIL, indicates whether the order of the founder strains should be randomized, independently for each RIL, or whether all RIL be derived from a common cross. In the latter case, for a 4-way RIL, the cross would be (AxB)x(CxD).

...

For type = "bcsft", additional arguments passed to sim.cross.bcsft.

Recombinant inbred lines

In the simulation of recombinant inbred lines (RIL), we simulate a single individual from each line, and no phenotypes are simulated (so the argument model is ignored).

The types riself and risib are the usual two-way RIL.

The types ri4self, ri4sib, ri8self, and ri8sib are RIL by selfing or sib-mating derived from four or eight founding parental strains.

For the 4- and 8-way RIL, one must include the genotypes of the founding individuals; these may be simulated with simFounderSnps. Also, the output cross will contain a component cross, which is a matrix with rows corresponding to RIL and columns corresponding to the founders, indicating order of the founder strains in the crosses used to generate the RIL.

The coding of genotypes in 4- and 8-way RIL is rather complicated. It is a binary encoding of which founder strains' genotypes match the RIL's genotype at a marker, and not that this is specific to the order of the founders in the crosses used to generate the RIL. For example, if an RIL generated from 4 founders has the 1 allele at a SNP, and the four founders have SNP alleles 0, 1, 0, 1, then the RIL allele matches that of founders B and D. If the RIL was derived by the cross (AxB)x(CxD), then the RIL genotype would be encoded \(2^{2-1} + 2^{3-1} = 6\). If the cross was derived by the cross (DxA)x(CxB), then the RIL genotype would be encoded \(2^{1-1} + 2^{4-1} = 9\). These get reorganized after calls to calc.genoprob, sim.geno, or argmax.geno, and this approach simplifies the hidden Markov model (HMM) code.

For the 4- and 8-way RIL, genotyping errors are simulated only if the founder genotypes are 0/1 SNPs.

Author

Karl W Broman, broman@wisc.edu

Details

Meiosis is assumed to follow the Stahl model for crossover interference (see the references, below), of which the no interference model and the chi-square model are special cases. Chiasmata on the four-strand bundle are a superposition of chiasmata from two different mechanisms. With probability p, they arise by a mechanism exhibiting no interference; the remainder come from a chi-square model with inteference parameter m. Note that m=0 corresponds to no interference, and with p=0, one gets a pure chi-square model.

If a chromosomes has class X, it is assumed to be the X chromosome, and is assumed to be segregating in the cross. Thus, in an intercross, it is segregating like a backcross chromosome. In a 4-way cross, a second phenotype, sex, will be generated.

QTLs are assumed to act additively, and the residual phenotypic variation is assumed to be normally distributed with variance 1.

For a backcross, the effect of a QTL is a single number corresponding to the difference between the homozygote and the heterozygote.

For an intercross, the effect of a QTL is a pair of numbers, (\(a,d\)), where \(a\) is the additive effect (half the difference between the homozygotes) and \(d\) is the dominance deviation (the difference between the heterozygote and the midpoint between the homozygotes).

For a four-way cross, the effect of a QTL is a set of three numbers, (\(a,b,c\)), where, in the case of one QTL, the mean phenotype, conditional on the QTL genotyping being AC, BC, AD or BD, is \(a\), \(b\), \(c\) or 0, respectively.

References

Copenhaver, G. P., Housworth, E. A. and Stahl, F. W. (2002) Crossover interference in arabidopsis. Genetics 160, 1631--1639.

Foss, E., Lande, R., Stahl, F. W. and Steinberg, C. M. (1993) Chiasma interference as a function of genetic distance. Genetics 133, 681--691.

Zhao, H., Speed, T. P. and McPeek, M. S. (1995) Statistical analysis of crossover interference using the chi-square model. Genetics 139, 1045--1056.

Broman, K. W. (2005) The genomes of recombinant inbred lines Genetics 169, 1133--1146.

Teuscher, F. and Broman, K. W. (2007) Haplotype probabilities for multiple-strain recombinant inbred lines. Genetics 175, 1267--1274.

See Also

sim.map, read.cross, fake.f2, fake.bc fake.4way, simFounderSnps

Examples

Run this code
# simulate a genetic map
map <- sim.map()


### simulate 250 intercross individuals with 2 QTLs
fake <- sim.cross(map, type="f2", n.ind=250,
                  model = rbind(c(1,45,1,1),c(5,20,0.5,-0.5)))


### simulate 100 backcross individuals with 3 QTL
# a 10-cM map model after the mouse
data(map10)

fakebc <- sim.cross(map10, type="bc", n.ind=100,
                    model=rbind(c(1,45,1), c(5,20,1), c(5,50,1)))


### simulate 8-way RIL by sibling mating
# get lengths from the above 10-cM map
L <- ceiling(sapply(map10, max))

# simulate a 1 cM map
themap <- sim.map(L, n.mar=L+1, eq.spacing=TRUE)

# simulate founder genotypes
pg <- simFounderSnps(themap, "8")

# simulate the 8-way RIL by sib mating (256 lines)
ril <- sim.cross(themap, n.ind=256, type="ri8sib", founderGeno=pg)

Run the code above in your browser using DataLab