argmax.geno(cross, step=0, off.end=0, error.prob=0.0001,
map.function=c("haldane","kosambi","c-f","morgan"),
stepwidth=c("fixed", "variable", "max"))
cross: A cross object. See read.cross for details.
step: If step=0, genotypes are reconstructed only at the marker locations.
stepwidth: One of "fixed", "variable", or "max". "variable" is included for the qtlbim package.
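The map.function argument names four ways of converting a genetic distance into a recombination fraction. As a rough illustration, the closed-form conversions for three of them can be sketched in base R. The helper names (haldane_rf, kosambi_rf, morgan_rf) are hypothetical and are not R/qtl functions; distances are assumed to be in cM, and the Carter-Falconer function is left out because it has no closed form in this direction.

```r
# Hypothetical helpers (not part of R/qtl): distance in cM -> recombination fraction
haldane_rf <- function(d) 0.5 * (1 - exp(-2 * d / 100))
kosambi_rf <- function(d) 0.5 * tanh(2 * d / 100)
morgan_rf  <- function(d) pmin(d / 100, 0.5)  # linear in distance, capped at 0.5

d <- c(1, 10, 50)  # example distances in cM
rf <- cbind(cM = d,
            haldane = haldane_rf(d),
            kosambi = kosambi_rf(d),
            morgan  = morgan_rf(d))
print(round(rf, 4))
```

For short distances the three conversions agree closely; they diverge as the distance grows.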
A cross object is returned with a component, argmax, added to each component of cross$geno.
The argmax component is a matrix of size [n.ind x n.pos], where n.pos is the number of positions at which the reconstructed genotypes were obtained, containing the most likely sequences of underlying genotypes.
Attributes "error.prob", "step", and "off.end" are set to the values of the corresponding arguments, for later reference.

Results can be unstable when step is small but positive. One may observe quite different results for different values of step. The problem is that, in the presence of data like A----H, the sequences AAAAAA and HHHHHH may be more likely than any one of the sequences AAAAAH, AAAAHH, AAAHHH, AAHHHH, AHHHHH, even when those single-recombination sequences are, taken together, more probable. The Viterbi algorithm produces a single "most likely" sequence of underlying genotypes.
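To make the A----H point concrete, here is a small base-R sketch. It assumes a toy two-genotype model (A and H only, equal initial probabilities), a constant recombination fraction between adjacent positions derived from a 0.5 cM step with the Haldane map function, and an error rate of 0.01. The function joint_prob is hypothetical and is not how R/qtl performs the calculation internally.

```r
# Toy model (an assumption for illustration, not R/qtl internals)
r <- 0.5 * (1 - exp(-2 * 0.5 / 100))  # Haldane recombination fraction for a 0.5 cM step
e <- 0.01                             # assumed genotyping error rate

# Joint probability of a candidate genotype sequence and the observed data
joint_prob <- function(seq_str, obs, r, e) {
  g <- strsplit(seq_str, "")[[1]]
  o <- strsplit(obs, "")[[1]]
  # transitions: stay with prob 1 - r, switch with prob r
  trans <- ifelse(g[-1] == g[-length(g)], 1 - r, r)
  # emissions: missing data ("-") is uninformative; otherwise correct or in error
  emit <- ifelse(o == "-", 1, ifelse(o == g, 1 - e, e))
  0.5 * prod(trans) * prod(emit)
}

seqs <- c("AAAAAA", "HHHHHH", "AAAAAH", "AAAAHH", "AAAHHH", "AAHHHH", "AHHHHH")
probs <- sapply(seqs, joint_prob, obs = "A----H", r = r, e = e)
print(signif(probs, 3))
print(c(no_recombination = unname(probs["AAAAAA"]),
        all_single_recombination = sum(probs[3:7])))
```

Under these assumptions, AAAAAA and HHHHHH each have a larger joint probability than any individual single-recombination sequence, although the five single-recombination sequences together are more probable. The one sequence reported by the Viterbi algorithm would therefore be a no-recombination sequence.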
This is done by calculating $\gamma_k(v_k) = \max_{v_1, \ldots, v_{k-1}} \Pr(g_1 = v_1, \ldots, g_k = v_k, O_1, \ldots, O_k)$ for $k = 1, \ldots, n$ and then tracing back through the sequence.
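The recursion and traceback can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not R/qtl's implementation: it uses two genotypes (A and H) with equal initial probabilities, a single recombination fraction r between adjacent positions, an error rate e, and "-" for a missing observation. It works with log probabilities, so gamma below holds $\log \gamma_k(v_k)$. The function name viterbi_path is hypothetical.

```r
# Hypothetical two-state Viterbi sketch (not the R/qtl implementation)
viterbi_path <- function(obs, r, e, states = c("A", "H")) {
  n <- length(obs)
  m <- length(states)
  # log emission probability; a missing observation carries no information
  log_emit <- function(o, s) {
    if (o == "-") 0 else if (o == s) log(1 - e) else log(e)
  }
  # log transition matrix: rows = previous state, columns = current state
  log_tr <- log(matrix(c(1 - r, r, r, 1 - r), nrow = 2))
  gamma <- matrix(-Inf, m, n)  # log of gamma_k(v_k)
  back  <- matrix(0L, m, n)    # best previous state, for the traceback
  for (s in seq_len(m)) {
    gamma[s, 1] <- log(0.5) + log_emit(obs[1], states[s])
  }
  for (k in seq_len(n)[-1]) {
    for (s in seq_len(m)) {
      cand <- gamma[, k - 1] + log_tr[, s]
      back[s, k]  <- which.max(cand)
      gamma[s, k] <- max(cand) + log_emit(obs[k], states[s])
    }
  }
  # trace back from the best final state
  path <- integer(n)
  path[n] <- which.max(gamma[, n])
  for (k in rev(seq_len(n - 1))) {
    path[k] <- back[path[k + 1], k + 1]
  }
  states[path]
}

print(viterbi_path(c("A", "A", "H", "H"), r = 0.1, e = 0.01))
```

With the observations A, A, H, H, r = 0.1 and e = 0.01, the traceback yields a path with a single recombination between the second and third positions. Ties between equally likely paths are broken by which.max, which picks the first maximum.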
Rabiner, L. R. (1989) A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 77, 257--286.
See also: sim.geno, calc.genoprob, fill.geno
library(qtl)
data(fake.f2)
fake.f2 <- subset(fake.f2, chr=18:19)  # keep chromosomes 18 and 19 only
fake.f2 <- argmax.geno(fake.f2, step=2, off.end=5, error.prob=0.01)