Given a rooted phylogenetic tree, a fixed-rates continuous-time Markov model for the evolution of a discrete trait ("Mk model", described by a transition matrix) and a probability vector for the root, simulate random outcomes of the model on all nodes and/or tips of the tree. The function traverses nodes from root to tips and randomly assigns a state to each node or tip based on its parent's previously assigned state and the specified transition rates between states. The generated states have joint distributions consistent with the Markov model. Optionally, multiple independent simulations can be performed using the same model.
simulate_mk_model(tree, Q, root_probabilities="stationary",
include_tips=TRUE, include_nodes=TRUE,
Nsimulations=1, drop_dims=TRUE)
A rooted tree of class "phylo". The root is assumed to be the unique node with no incoming edge.
A numeric matrix of size Nstates x Nstates, storing the transition rates between states. In particular, every row must sum up to zero.
Probabilities of the different states at the root.
Either a character vector with value "stationary" or "flat", or a numeric vector of length Nstates, where Nstates is the number of possible states of the trait. In the later case, root_probabilities
must be a valid probability vector, i.e. with non-negative values summing up to 1. "stationary" sets the probabilities at the root to the stationary distribution of Q
(see get_stationary_distribution
), while "flat" means that each state is equally probable at the root.
Include random states for the tips. If FALSE
, no states will be returned for tips.
Include random states for the nodes. If FALSE
, no states will be returned for nodes.
Number of random independent simulations to perform. For each node and/or tip, there will be Nsimulations
random states generated.
Logical, specifying whether the returned tip_states
and node_states
(see below) should be vectors, if Nsimulations==1
. If drop_dims==FALSE
, then tip_states
and tip_nodes
will always be 2D matrices.
A list with the following elements:
Either NULL
(if include_tips==FALSE
), or a 2D integer matrix of size Nsimulations x Ntips with values in 1,..,Nstates, where Ntips is the number of tips in the tree and Nstates is the number of possible states of the trait. The [r,c]-th entry of this matrix will be the state of tip c generated by the r-th simulation. If drop_dims==TRUE
and Nsimulations==1
, then tip_states
will be a vector.
Either NULL
(if include_nodes==FALSE
), or a 2D integer matrix of size Nsimulations x Nnodes with values in 1,..,Nstates, where Nnodes is the number of nodes in the tree. The [r,c]-th entry of this matrix will be the state of node c generated by the r-th simulation. If drop_dims==TRUE
and Nsimulations==1
, then node_states
will be a vector.
For this function, the trait's states must be represented by integers within 1,..,Nstates, where Nstates is the total number of possible states. If the states are originally in some other format (e.g. characters or factors), you should map them to a set of integers 1,..,Nstates. These integers should correspond to row & column indices in the transition matrix Q
. You can easily map any set of discrete states to integers using the function map_to_state_space
.
If tree$edge.length
is missing, each edge in the tree is assumed to have length 1. The tree may include multi-furcations (i.e. nodes with more than 2 children) as well as mono-furcations (i.e. nodes with only one child). The time required per simulation decreases with the total number of requested simulations.
exponentiate_matrix
, get_stationary_distribution
,
simulate_bm_model
, simulate_ou_model
, simulate_rou_model
# NOT RUN {
# generate a random tree
tree = generate_random_tree(list(birth_rate_intercept=1),max_tips=1000)$tree
# simulate discrete trait evolution on the tree (5 states)
Nstates = 5
Q = get_random_mk_transition_matrix(Nstates, rate_model="ARD", max_rate=0.1)
tip_states = simulate_mk_model(tree, Q)$tip_states
# plot histogram of simulated tip states
barplot(table(tip_states)/length(tip_states), xlab="state")
# }
Run the code above in your browser using DataLab