- zoomodel
A valid zmodel object as defined by the zoomodel function. The
model indicates whether rates of exponential distributions are estimated or
predefined, the number of classes, the starting values for mixing
coefficients and rates, and the error probabilities. See "zoomodel" for more
details.
- zooin
A valid zdata object as obtained by the zoodata function. See
"zoodata" for more details.
- ids
An optional argument indicating the individual (its position in the
data file) that must be processed. It can also be a vector containing the
positions of several individuals to process. By default, the model runs for
all individuals.
- parameters
Specifies whether the parameters are estimated by
optimization with the L-BFGS-B method from the optim function (optional
argument - true by default). To skip parameter estimation, set
parameters=FALSE. In that case, the forward-backward
and Viterbi algorithms are run with the provided parameters.
- fb
A logical indicating whether the forward-backward algorithm is run
(optional argument - true by default). The forward-backward algorithm
estimates the local probabilities of belonging to each HBD or non-HBD class.
By default, the function returns only the HBD probabilities for each class,
averaged genome-wide, which correspond to the realized autozygosity
associated with each class. To obtain HBD probabilities at every marker
position, the option localhbd must be set to true (this generates larger
outputs).
- vit
A logical indicating whether the Viterbi algorithm is run (optional
argument - false by default). The Viterbi algorithm performs the decoding
(determining the underlying class at every marker position). Whereas the
forward-backward algorithm provides HBD probabilities (and thus how
confidently a region can be declared HBD), the Viterbi algorithm assigns
every marker position to one of the defined classes (HBD or non-HBD). When
informativity is high (many SNPs per HBD segment), results from the
forward-backward and the Viterbi algorithms are very similar. The Viterbi
algorithm is best suited to identifying HBD segments. To estimate realized
inbreeding and determine the HBD status of a position, we recommend using
the forward-backward algorithm, which better reflects uncertainty.
- localhbd
A logical indicating whether the HBD probabilities for each
individual at each marker are returned when using the Forward-Backward
algorithm (fb option). This is an optional argument that is false by
default.
- nT
Indicates the number of threads used when running RZooRoH in
parallel (optional argument - one thread by default).
- optim_method
Indicates which method the optim R function will use to
estimate the parameters of the model ("L-BFGS-B" by default). The possible
methods are "Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN" and "Brent".
Type "?optim" for more information. In our experience, the "L-BFGS-B"
method works well, but the method achieving the best likelihood varies
(according to the data set, the model, the priors, the constraints). The
same goes for the efficiency (speed). When zoorun does not converge, you
can try another method. Note that "L-BFGS-B" is the only method that allows
constraints on parameters (other methods are unconstrained).
- maxiter
Indicates the maximum number of iterations when estimating the
parameters with the R optim function (optional argument - 100 by default).
Iterations are not defined identically across methods. For instance, in one
iteration of the "L-BFGS-B" method, the likelihood of the model, estimated
with the forward algorithm, is evaluated multiple times. So, a value of 100
iterations is good for the "L-BFGS-B" method but larger values are required
for some other algorithms.
- minmix
This indicates the minimal value for the mixing coefficients. By
default it is set to 0 with the classical mixkl and kl models
(unconstrained). However, when using the step option or the "Interval"
HBDclass, the value is set to 1e-16 to avoid numerical problems. Note that
constraints are only allowed with the "L-BFGS-B" method from optim.
- maxr
This indicates the maximum difference between rates of successive
classes. It is an optional argument set by default to an arbitrarily large
value (100000000). Adding such a constraint might slow down convergence, and
we recommend running first without it
(constraints are only allowed with the "L-BFGS-B" method from optim).
- ibd
A logical indicating whether the function will be used to compute
IBD between pairs of phased haplotypes instead of HBD within individuals. In
that case, the user must provide a matrix with the pairs of haplotypes that
will be analyzed. This option can only be used if phased data are provided
as input with the zformat set to "vcf" or "haps". This is an optional
parameter set to false by default.
- ibdpairs
A matrix with four columns, indicating the pairs of haplotypes
being analyzed in an IBD analysis. Each haplotype is described by two
columns, one for the id of the individual and one for the
haplotype number within that individual (1 or 2). The first and third
columns indicate the ids of the individuals carrying the first and second
haplotype, respectively. The second and fourth columns indicate the
haplotype numbers within the first and second individuals, respectively.
With haploid data, the matrix must have only two columns indicating the
numbers of the first and second haplotypes of each pair. This is an optional
parameter; the matrix must be provided only when the ibd option is true.
- haploid
This is an optional parameter indicating whether haplotypes
belong to a haploid organism or chromosome (false by default). It can be
used only in combination with the 'ibd' option and requires phased data as
input with the zformat set to "haps". When haploid is true, the
ibdpairs matrix has only two columns indicating simply the haplotype
numbers. When haploid is true, the number of haplotypes can be odd, whereas
an even number is required when haploid is set to false.
- RecTable
This is an optional parameter indicating whether a finite
number of genetic distances are used (false by default). This option can
be used only with the "Interval" HBDclass. The "Interval" option can be
slow, in particular if large "intervals" of generations are defined. To
speed up computations, some variables are precomputed for a finite set of
genetic distances, selected to cover a broad range of possible values. The
real genetic distance between two genetic markers is then replaced by the
closest value in the table (the difference between the true and used genetic
distances is lower than 10%).
- trim_ad
This is an option still under evaluation (for testing only)
- hemiprob
This is an option still under evaluation (for testing only)
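
The notes on optim_method, maxiter, minmix and maxr state that constraints on parameters are only available with the "L-BFGS-B" method. As a minimal sketch in base R, independent of RZooRoH, the call below shows how box constraints and an iteration cap are passed to optim. The objective function, the target values and the bounds are hypothetical toy choices made for illustration; they are not the likelihood or the settings used inside zoorun.

```r
# Toy objective (hypothetical, NOT the RZooRoH likelihood): squared distance
# to a target whose first coordinate lies below the lower bound.
target <- c(-0.5, 0.3)
objective <- function(p) sum((p - target)^2)

# Box constraints, similar in spirit to a minimal value on mixing coefficients.
lower_bounds <- c(1e-16, 1e-16)
upper_bounds <- c(1, 1)

fit <- optim(
  par     = c(0.5, 0.5),          # starting values
  fn      = objective,
  method  = "L-BFGS-B",           # the optim method that accepts lower/upper
  lower   = lower_bounds,
  upper   = upper_bounds,
  control = list(maxit = 100)     # cap on the number of iterations
)

fit$par          # first coordinate is held at its lower bound
fit$convergence  # 0 indicates successful convergence
fit$counts       # number of function and gradient evaluations
```

In this toy case the unconstrained optimum is outside the box, so the first parameter ends up on its bound while the second reaches its target value. The evaluation counts in fit$counts are reported separately from the iteration cap, in line with the remark that one iteration can involve several likelihood evaluations.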