coala (version 0.4.0)

feat_mutation: Feature: Mutation

Description

This feature adds mutations to a model. Mutations occur in the genomes of the individuals with a given rate. The rate is per locus for unlinked loci and per trio for linked locus trios. By default, the same mutation rate is used for all loci, but it is possible to change this with par_variation and par_zero_inflation.

Usage

feat_mutation(rate, model = "IFS", base_frequencies = NA, tstv_ratio = NA, gtr_rates = NA, fixed_number = FALSE)

Arguments

rate
The mutation rate. Can be a numeric or a parameter. The rate is specified as $4 * N0 * mu$, where $mu$ is the mutation rate per locus.
model
The mutation model you want to use. Can be either 'IFS' (default), 'HKY' or 'GTR'. Refer to the mutation model section for detailed information.
base_frequencies
The equilibrium frequencies of the four bases used in the 'HKY' mutation model. Must be a numeric vector of length four, with the values for A, C, G and T, in that order.
tstv_ratio
The ratio of transitions to transversions used in the 'HKY' mutation model.
gtr_rates
The rates for the six nucleotide substitutions used in the 'GTR' model. Must be a numeric vector of length six. Order: A<->C, A<->G, A<->T, C<->G, C<->T, G<->T.
fixed_number
If set to TRUE, the number of mutations on each locus will always be exactly equal to the rate, rather than mutations occurring at that rate along the ancestral tree.
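
As a worked example of the scaling used for rate, the sketch below computes $4 * N0 * mu$ from a per-site mutation rate and a locus length. All numbers are hypothetical and chosen only for illustration. Treating N0 as a reference effective population size, and obtaining the per-locus rate by multiplying a per-site rate by the locus length, are assumptions of this sketch and are not stated on this page.

```r
# Hypothetical inputs (assumptions for illustration only)
N0 <- 10000           # assumed reference effective population size
mu_site <- 1e-8       # assumed mutation rate per site
locus_length <- 1000  # assumed locus length in base pairs

# Per-locus mutation rate: per-site rate times the number of sites
mu_locus <- mu_site * locus_length

# Scaled rate in the form described for the 'rate' argument
theta <- 4 * N0 * mu_locus
theta  # approximately 0.4
```

A value computed this way could then be passed on as feat_mutation(theta).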

Value

The feature, which can be added to a model using `+`.

Mutation Models

The infinite sites mutation (IFS) model is a frequently used simplification in population genetics. It assumes that each locus consists of infinitely many sites at which mutations can occur, and that each mutation hits a new site. Consequently, there are no back-mutations in this model. It does not generate DNA sequences, but only 0/1 coded data, where 0 denotes the ancestral state of a site and 1 the derived state created by a mutation.

The other mutation models are finite site models that generate more realistic sequences. The Hasegawa, Kishino and Yano (HKY) model (Hasegawa et al., 1985) allows different rates for transitions and transversions (tstv_ratio) and unequal frequencies of the four nucleotides (base_frequencies). The general reversible process (GTR) model (e.g. Yang, 1994) is more general than the HKY model and allows the rate of each type of substitution to be set individually (gtr_rates). The rates are assumed to be symmetric (e.g., the rate for T to G is equal to the one for G to T).
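
To make the symmetry assumption and the ordering of gtr_rates concrete, the following base R sketch arranges six rates into a symmetric 4 x 4 matrix. The rate values are hypothetical, and the matrix only illustrates how the six values map onto pairs of bases; it is not meant to describe how coala or its simulators process the rates internally.

```r
bases <- c("A", "C", "G", "T")

# Hypothetical rates in the documented order:
# A<->C, A<->G, A<->T, C<->G, C<->T, G<->T
gtr_rates <- c(1, 2, 1, 1, 2, 1)

# Fill the lower triangle column by column, which matches the order above:
# column A gets (C, G, T), column C gets (G, T), column G gets (T)
rate_matrix <- matrix(0, nrow = 4, ncol = 4, dimnames = list(bases, bases))
rate_matrix[lower.tri(rate_matrix)] <- gtr_rates

# Mirror into the upper triangle so that rate(X -> Y) equals rate(Y -> X)
rate_matrix <- rate_matrix + t(rate_matrix)

rate_matrix
isSymmetric(rate_matrix)
```

In this hypothetical vector the two transitions (A<->G and C<->T) are given twice the rate of the four transversions, so rate_matrix["G", "T"] and rate_matrix["T", "G"] hold the same value, as the symmetry assumption requires.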

See Also

For using rates that vary between the loci in a model: par_variation, par_zero_inflation

For adding recombination: feat_recombination.

Other features: feat_growth, feat_ignore_singletons, feat_migration, feat_outgroup, feat_pop_merge, feat_recombination, feat_selection, feat_size_change, feat_unphased

Examples

# A model with a constant mutation rate of 5:
model <- coal_model(5, 1) + feat_mutation(5) + sumstat_seg_sites()
simulate(model)

# A model with exactly 7 mutations per locus:
model <- coal_model(5, 1) + feat_mutation(7, fixed_number = TRUE) + sumstat_seg_sites()
simulate(model)

# A model using the HKY model:
model <- coal_model(c(10, 1), 2) +
 feat_mutation(7.5, model = "HKY", tstv_ratio = 2,
               base_frequencies = c(.25, .25, .25, .25)) +
 feat_outgroup(2) +
 feat_pop_merge(1.0, 2, 1) +
 sumstat_seg_sites()
 ## Not run: simulate(model)

# A model using the GTR model:
model <- coal_model(c(10, 1), 1, 25) +
 feat_mutation(7.5, model = "GTR",
               gtr_rates = c(1, 1, 1, 1, 1, 1) / 6) +
 feat_outgroup(2) +
 feat_pop_merge(1.0, 2, 1) +
 sumstat_dna()
 ## Not run: simulate(model)$dna