Usage
PhyloEM(phylo, Y_data, process = c("BM", "OU", "scOU", "rBM"), check_postorder = TRUE, independent = FALSE, K_max = max(floor(sqrt(length(phylo$tip.label))), 10), use_previous = FALSE, order = TRUE, method.selection = c("LINselect", "DDSE", "Djump"), C.BM1 = 0.1, C.BM2 = 2.5, C.LINselect = 1.1, method.variance = c("upward_downward", "simple"), method.init = "lasso", method.init.alpha = "estimation", method.init.alpha.estimation = c("regression", "regression.MM", "median"), methods.segmentation = c("lasso", "best_single_move"), alpha_grid = TRUE, nbr_alpha = 10, random.root = TRUE, stationary.root = TRUE, alpha = NULL, check.tips.names = TRUE, progress.bar = TRUE, estimates = NULL, save_step = FALSE, parallel_alpha = FALSE, Ncores = 3, impute_init_Rphylopars = FALSE, K_lag_init = 5, light_result = TRUE, ...)
Arguments
phylo
A phylogenetic tree of class phylo
(from package ape
). Y_data
Matrix of data at the tips, size p x ntaxa. Each line is a
trait, and each column is a tip. The column names are checked against the
tip names of the tree.
process
The model used for the fit. One of "BM" (for a full BM model,
univariate or multivariate); "OU" (for an OU with independent traits,
univariate or multivariate); or "scOU" (for a "scalar OU" model, see details).
check_postorder
Re-order the tree in post-order. If the Upward-Downward
algorithm is used, the tree need to be in post-order. Default to TRUE if the
upward-downward is used, otherwise automatically set to FALSE.
independent
Are the trait assumed to be independent from one another?
Default to FALSE. OU in a multivariate setting only works if TRUE.
K_max
The maximum number of shifts to be considered. Default to
$max(|\sqrt ntaxa|, 10)$.
use_previous
Should the initialization for K+1 shifts use the
estimation for $K$ shifts already obtained? Default to FALSE.
order
Should the estimations be done for K increasing (TRUE) or K
decreasing (FALSE)? If use_previous=FALSE, this has no influence, exept if one
initialization fails. Default to TRUE.
method.selection
Method selection to be used. Several ones can be
used at the same time. One of "LINselect" for the Baraud Giraud Huet LINselect
method; "DDSE" for the Slope Heuristic or "Djump" for the Jump Heuristic, last
two based the Birgé Massart method.
C.BM1
Multiplying constant to be used for the BigeMassart1 method.
Need to be positive. Default to 0.1.
C.BM2
Multiplying constant to be used for the BigeMassart2 method.
Default to 2.5.
C.LINselect
Multiplying constant to be used for the LINselect method.
Need to be greater than 1. Default to 1.1.
method.variance
Algorithm to be used for the moments computations at the
E step. One of "simple" for the naive method; of "upward_downward" for the
Upward Downward method (usually faster). Default to "upward_downward".
method.init
The initialization method. One of "lasso" for the LASSO
base initialization method; or "default" for user-specified initialization
values. Default to "lasso".
method.init.alpha
For OU model, initialisation method for the selection
strength alpha. One of "estimation" for a cherry-based initialization, using
nlrob
; or "default" for user-specified
initialization values. Default to "estimation". method.init.alpha.estimation
If method.init.alpha="estimation",
choice of the estimation(s) methods to be used. Choices among "regression",
(method="M" is passed to nlrob
); "regression.MM"
(method="MM" is passed to nlrob
) or "median"
(nlrob
is not used, a simple median is taken).
Default to all of them. methods.segmentation
For OU, method(s) used at the M step to find new
candidate shifts positions. Choices among "lasso" for a LASSO-based algorithm;
and "best_single_move" for a one-move at a time based heuristic. Default to
both of them. Using only "lasso" might speed up the function a lot.
alpha_grid
Wether to use a grid for alpha values. Default to TRUE. This
is the only available method for scOU. This method is not available for OU with
multivariate traits. OU with univariate traits can take both TRUE or FALSE. If
TRUE, a grid based on the branch length of the tree is automatically computed,
using function find_grid_alpha
. nbr_alpha
If alpha_grid=TRUE
, the number of alpha values on the
grid. Default to 10.
random.root
Wether the root is assumed to be random (TRUE) of fixed
(FALSE). Default to TRUE
stationary.root
Wether the root is assumed to be in the stationnary
state. Default to TRUE.
alpha
If the estimation is done with a fixed alpha (either known, or
on a grid), the possible value for alpha. Default to NULL.
check.tips.names
Wether to check the tips names of the tree against
the column names of the data. Default to TRUE.
progress.bar
Wether to display a progress bar of the computations.
Default to TRUE.
estimates
The result of a previous run of this same function. This
function can be re-run for other model election method. Default to NULL.
save_step
If alpha_grid=TRUE, wether to save the intermediate results
for each value of alpha (in a temporary file). Useful for long computations.
Default to FALSE.
parallel_alpha
If alpha_grid=TRUE, wether to run the
estimations with different values of alpha on separate cores. Default to
FALSE. If TRUE, the log is written as a temporary file.
Ncores
If parallel_alpha=TRUE, number of cores to be used.
K_lag_init
Number of extra shifts to be considered at the initialization
step. Increases the accuracy, but can make computations quite slow of taken
too high. Default to 5.
light_result
if TRUE (the default), the object returned is enlighted,
without easilly computatble quantities. If FALSE, the object can be very heavy, but
its subsequent manipulations can be faster (especially for plotting).
...
Further arguments to be passed to estimateEM
, including
tolerance parameters for stopping criterions, maximal number of iterations, etc.