simulate_mk_model: Simulate an Mk model for discrete trait evolution.

Description

Given a rooted phylogenetic tree, a fixed-rates continuous-time Markov model for the evolution of a discrete trait ("Mk model", described by a transition matrix) and a probability vector for the root, simulate random outcomes of the model on all nodes and/or tips of the tree. The function traverses nodes from root to tips and randomly assigns a state to each node or tip based on its parent's previously assigned state and the specified transition rates between states. The generated states have joint distributions consistent with the Markov model. Optionally, multiple independent simulations can be performed using the same model.

Usage

simulate_mk_model(tree, Q, root_probabilities="stationary",
                  include_tips=TRUE, include_nodes=TRUE, 
                  Nsimulations=1, drop_dims=TRUE)

Arguments

tree

A rooted tree of class "phylo". The root is assumed to be the unique node with no incoming edge.

A numeric matrix of size Nstates x Nstates, storing the transition rates between states. In particular, every row must sum up to zero.

root_probabilities

Probabilities of the different states at the root. Either a character vector with value "stationary" or "flat", or a numeric vector of length Nstates, where Nstates is the number of possible states of the trait. In the later case, root_probabilities must be a valid probability vector, i.e. with non-negative values summing up to 1. "stationary" sets the probabilities at the root to the stationary distribution of Q (see get_stationary_distribution), while "flat" means that each state is equally probable at the root.

include_tips

Include random states for the tips. If FALSE, no states will be returned for tips.

include_nodes

Include random states for the nodes. If FALSE, no states will be returned for nodes.

Nsimulations

Number of random independent simulations to perform. For each node and/or tip, there will be Nsimulations random states generated.

drop_dims

Logical, specifying whether the returned tip_states and node_states (see below) should be vectors, if Nsimulations==1. If drop_dims==FALSE, then tip_states and tip_nodes will always be 2D matrices.

Value

A list with the following elements:

tip_states

Either NULL (if include_tips==FALSE), or a 2D integer matrix of size Nsimulations x Ntips with values in 1,..,Nstates, where Ntips is the number of tips in the tree and Nstates is the number of possible states of the trait. The [r,c]-th entry of this matrix will be the state of tip c generated by the r-th simulation. If drop_dims==TRUE and Nsimulations==1, then tip_states will be a vector.

node_states

Either NULL (if include_nodes==FALSE), or a 2D integer matrix of size Nsimulations x Nnodes with values in 1,..,Nstates, where Nnodes is the number of nodes in the tree. The [r,c]-th entry of this matrix will be the state of node c generated by the r-th simulation. If drop_dims==TRUE and Nsimulations==1, then node_states will be a vector.

Details

For this function, the trait's states must be represented by integers within 1,..,Nstates, where Nstates is the total number of possible states. If the states are originally in some other format (e.g. characters or factors), you should map them to a set of integers 1,..,Nstates. These integers should correspond to row & column indices in the transition matrix Q. You can easily map any set of discrete states to integers using the function map_to_state_space.

If tree$edge.length is missing, each edge in the tree is assumed to have length 1. The tree may include multi-furcations (i.e. nodes with more than 2 children) as well as mono-furcations (i.e. nodes with only one child). The time required per simulation decreases with the total number of requested simulations.

Examples

Run this code

# NOT RUN {
# generate a random tree
tree = generate_random_tree(list(birth_rate_intercept=1),max_tips=1000)$tree

# simulate discrete trait evolution on the tree (5 states)
Nstates = 5
Q = get_random_mk_transition_matrix(Nstates, rate_model="ARD", max_rate=0.1)
tip_states = simulate_mk_model(tree, Q)$tip_states

# plot histogram of simulated tip states
barplot(table(tip_states)/length(tip_states), xlab="state")
# }

Run the code above in your browser using DataLab