genGrowTree - Generates a rooted binary tree in phylo format with the given number n of leaves under a specified discrete-time tree growing model without extinction.
These tree growing models act at the leaves by varying their speciation rates according to a parameter ZETA or variance SIGMA. They may also depend on so-called trait values of the leaves (e.g., continuous or discrete age, or another numeric trait that affects fitness).
You may choose a built-in model (see use_built_in) or specify a new model by defining how the rates (and optionally traits) change in every time step (see parameters childRates and otherRates as well as childTraits and otherTraits; see also Table 5 of the supplementary material of the corresponding manuscript).
genGrowTree(
n,
STARTING_RATE = 1,
STARTING_TRAIT = 10,
ZETA = 1,
SIGMA = 0,
childRates,
otherRates,
childTraits = NULL,
otherTraits = NULL,
use_built_in = NULL
)
genGrowTree: A single tree of class phylo is returned.
n: Integer value that specifies the desired number of leaves, i.e., vertices with in-degree 1 and out-degree 0. Due to the restrictions of the phylo or multiPhylo format, the number of leaves must be at least 2 since there must be at least one edge.
STARTING_RATE: Positive numeric value (default = 1) that specifies the initial rate at which speciation events occur. It influences only the edge lengths, not the tree topology.
STARTING_TRAIT: Numeric value (default = 10) that specifies the initial state of a trait.
ZETA: Constant non-negative numeric value (default = 1) that can influence the speciation rates. Can also be a vector if it is used as such in the definitions of the functions childRates, otherRates, childTraits, and otherTraits.
SIGMA: Constant non-negative numeric value (default = 0) that can influence the speciation rates. Can also be a vector if it is used as such in the definitions of the functions childRates, otherRates, childTraits, and otherTraits.
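As a minimal sketch of the vector case, the function below reads ZETA as a two-element vector with one factor per child. The name child_rates_two_factors and the values c(2, 0.5) are hypothetical and chosen only for illustration; the argument list follows the childRates signature described on this page.

```r
# Hypothetical childRates rule that treats ZETA (passed in as `ze`) as a
# vector of length 2: child 1 gets pr * ze[1], child 2 gets pr * ze[2].
child_rates_two_factors <- function(sr, st, pr, pt, ct, ze, si) {
  c(pr * ze[1], pr * ze[2])
}

# Direct call with a parent rate of 1 and no traits (ct = NA).
two_factor_rates <- child_rates_two_factors(
  sr = 1, st = 10, pr = 1, pt = NA, ct = c(NA, NA), ze = c(2, 0.5), si = 0
)
print(two_factor_rates)
```

Such a function would be passed as childRates together with a matching vector for ZETA.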
childRates: A function that generates two speciation rates for the children emerging from a speciation event based on various factors. Necessary if use_built_in is not specified. childTraits works similarly but is executed before childRates.
All available parameters are:
the starting rate sr,
the starting trait value st,
the parent's rate pr,
the parent's trait value pt,
the children's trait values ct (vector with entries ct[1] and ct[2]),
the parameters zeta ze and sigma si.
All parameters have to appear in the function definition but not necessarily in the body of the function. Trait values are NA if childTraits and otherTraits are not given.
Example: function (sr, st, pr, pt, ct, ze, si) return(c(pr*ze, pr*(1-ze))) for biased speciation.
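The biased-speciation example can be written out and checked by hand, as in the sketch below. The names biased_child_rates, keep_other_rates, and child_split are hypothetical. The call to genGrowTree assumes the function is provided by the treebalance package, so it is skipped if that package is not installed.

```r
# Biased speciation: the parent's rate pr is split between the two
# children in proportion ze : (1 - ze).
biased_child_rates <- function(sr, st, pr, pt, ct, ze, si) {
  c(pr * ze, pr * (1 - ze))
}

# Leaves not involved in the speciation event keep their old rate.
keep_other_rates <- function(sr, st, or, ot, ze, si) {
  or
}

# Direct call: a parent with rate 4 and ze = 0.25.
child_split <- biased_child_rates(sr = 1, st = 10, pr = 4, pt = NA,
                                  ct = c(NA, NA), ze = 0.25, si = 0)
print(child_split)       # the two child rates
print(sum(child_split))  # adds up to the parent's rate pr

# Assumption: genGrowTree comes from the treebalance package.
if (requireNamespace("treebalance", quietly = TRUE)) {
  tree <- treebalance::genGrowTree(n = 5, ZETA = 0.25,
                                   childRates = biased_child_rates,
                                   otherRates = keep_other_rates)
  print(class(tree))
}
```

Because the two child rates always sum to pr, ZETA only controls how unevenly the parent's rate is divided.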
otherRates: A function that generates a new speciation rate for all leaves not affected by the speciation event (all but parent and children) based on various factors. The function is applied after the speciation event, i.e., after childRates and childTraits. Necessary if use_built_in is not specified. otherTraits works similarly.
All available parameters are:
the starting rate sr,
the starting trait value st,
the leaf's old rate or,
the leaf's old trait value ot,
the parameters zeta ze and sigma si.
All parameters have to appear in the function definition but not necessarily in the body of the function. Trait values are NA if childTraits and otherTraits are not given.
Example: function (sr, st, or, ot, ze, si) return(or*ze) for age-step-based fertility.
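To illustrate how such a rule compounds over time, the sketch below applies the age-step-based example to a single leaf for three consecutive time steps. The name asb_other_rates and the chosen values are hypothetical; the loop only mimics repeated application by hand and is not the package's internal code.

```r
# Age-step-based fertility: the rate of a leaf that is not involved in the
# speciation event is multiplied by ze.
asb_other_rates <- function(sr, st, or, ot, ze, si) {
  or * ze
}

# Apply the rule by hand for three time steps, starting at rate 1.
leaf_rate <- 1
for (step in 1:3) {
  leaf_rate <- asb_other_rates(sr = 1, st = 10, or = leaf_rate, ot = NA,
                               ze = 0.5, si = 0)
}
print(leaf_rate)  # starting rate times ze^3
```

With ze < 1 the rate of an old leaf decays geometrically, and with ze > 1 it grows.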
childTraits: An optional function (default = NULL) that generates two trait values for the children emerging from a speciation event based on various factors. See childRates for available parameters (except ct) and explanations. Not necessary; it is only applied if it is not NULL.
Example: function (sr, st, pr, pt, ze, si) return(c(0, 0)) for age.
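Since childTraits is executed before childRates, the children's new trait values are available to childRates as ct. The sketch below replays one speciation event by hand in that order. Both function names and the rate rule sr * ze^ct are hypothetical, and the snippet does not call genGrowTree.

```r
# Hypothetical trait rule: both children start with trait value (age) 0.
age_child_traits <- function(sr, st, pr, pt, ze, si) {
  c(0, 0)
}

# Hypothetical rate rule that uses the children's trait values ct.
trait_child_rates <- function(sr, st, pr, pt, ct, ze, si) {
  sr * ze^ct
}

# One speciation event, evaluated in the documented order:
# first the traits, then the rates that depend on them.
new_ct <- age_child_traits(sr = 1, st = 10, pr = 1, pt = 3, ze = 0.5, si = 0)
new_cr <- trait_child_rates(sr = 1, st = 10, pr = 1, pt = 3, ct = new_ct,
                            ze = 0.5, si = 0)
print(new_ct)
print(new_cr)
```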
otherTraits: An optional function (default = NULL) that generates a new trait value for all leaves not affected by the speciation event (all but parent and children) based on various factors. See otherRates for available parameters and explanations. Not necessary; it is only applied if it is not NULL.
Example: function (sr, st, or, ot, ze, si) return(ot+1) for discrete age (age in time steps).
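The discrete-age example can be tried on a few leaves at once, as in the sketch below. The vector old_ages is made-up data, and applying the rule leaf by leaf with vapply is only an illustration of the rule, not a statement about how the package evaluates it internally.

```r
# Discrete age: every leaf that is not involved in the speciation event
# gets one time step older.
age_other_traits <- function(sr, st, or, ot, ze, si) {
  ot + 1
}

# Hypothetical ages of three such leaves before the time step.
old_ages <- c(0, 2, 5)
new_ages <- vapply(old_ages, function(age) {
  age_other_traits(sr = 1, st = 10, or = 1, ot = age, ze = 1, si = 0)
}, numeric(1))
print(new_ages)
```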
use_built_in: Optional (default = NULL): Character string specifying which of the already implemented models should be used. Overrides childRates, otherRates, childTraits, and otherTraits.
Here is a list of available models with their (abbreviated) underlying functions given in parentheses (in the order childRates, otherRates; then childTraits and otherTraits if necessary):
"DCO_sym": Symmetric direct-children-only, ZETA > 0
(c(sr*ze, sr*ze), sr)
"DCO_asym": Asymmetric direct-children-only, ZETA > 0
(c(sr*ze, pr), sr)
"IF_sym": Symmetric inherited fertility, ZETA > 0
(c(pr*ze, pr*ze), or)
"IF_asym": Asymmetric inherited fertility, ZETA > 0
(c(pr*ze, pr), or)
"IF-diff": Unequal fertility inheritance, ZETA >= 1
(c(2*pr*ze/(ze+1), 2*pr/(ze+1)), or)
"biased": Biased speciation, ZETA >= 0 and <= 1
(c(pr*ze, pr*(1-ze)), or)
"ASB": Age-step-based fertility, ZETA > 0
(c(sr, sr), or*ze)
"simpleBrown_sym": Symmetric simple Brownian, SIGMA >= 0
(c(max{pr + rnorm(1, mean=0, sd=si), 1e-100}, max{pr + rnorm(1, mean=0, sd=si), 1e-100}), or)
"simpleBrown_asym": Asymmetric simple Brownian, SIGMA >= 0
(c(max{pr + rnorm(1, mean=0, sd=si), 1e-100}, pr), or)
"lin-Brown_sym": Sym. punctuated(-intermittent) linear-Brownian, SIGMA vector with two values >= 0
(c(10^(log(ct[1]) + rnorm(1, mean=0, sd=si[1])), 10^(log(ct[2]) + rnorm(1, mean=0, sd=si[1]))), or;
c(max{pt + rnorm(1, mean=0, sd=si[2]), 1e-100}, max{pt + rnorm(1, mean=0, sd=si[2]), 1e-100}), ot)
"lin-Brown_asym": Asym. punctuated(-intermittent) linear-Brownian, SIGMA vector with two values >= 0
(c(10^(log(ct[1]) + rnorm(1, mean=0, sd=si[1])), pr), or;
c(max{pt + rnorm(1, mean=0, sd=si[2]), 1e-100}, pt), ot)
"lin-Brown-bounded_sym": Bounded sym. punctuated(-intermittent) linear-Brownian, SIGMA vector with two values >= 0, STARTING_TRAIT is automatically set to 10
(c(10^(log(ct[1]) + rnorm(1, mean=0, sd=si[1])), 10^(log(ct[2]) + rnorm(1, mean=0, sd=si[1]))), or;
c(min{max{pt + rnorm(1, mean=0, sd=si[2]), 1e-100}, 20}, min{max{pt + rnorm(1, mean=0, sd=si[2]), 1e-100}, 20}), ot)
"lin-Brown-bounded_asym": Bounded asym. punctuated(-intermittent) linear-Brownian, SIGMA vector with two values >= 0
(c(10^(log(ct[1]) + rnorm(1, mean=0, sd=si[1])), pr), or;
c(min{max{pt + rnorm(1, mean=0, sd=si[2]), 1e-100}, 20}, pt), ot)
"log-Brown_sym": Sym. punctuated(-intermittent) log-Brownian, SIGMA vector with two values >= 0
(c(10^(log(ct[1]) + rnorm(1, mean=0, sd=si[1])), 10^(log(ct[2]) + rnorm(1, mean=0, sd=si[1]))), or;
c(10^(log(pt) + rnorm(1, mean=0, sd=si[2])), 10^(log(pt) + rnorm(1, mean=0, sd=si[2]))), ot)
"log-Brown_asym": Asym. punctuated(-intermittent) log-Brownian, SIGMA vector with two values >= 0
(c(10^(log(ct[1]) + rnorm(1, mean=0, sd=si[1])), pr), or;
c(10^(log(pt) + rnorm(1, mean=0, sd=si[2])), pt), ot)
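As a small worked check of one entry, the sketch below writes the "IF-diff" child rates, 2 times pr times ze divided by (ze+1) and 2 times pr divided by (ze+1), as a stand-alone function. The name if_diff_child_rates is hypothetical, and the function is a re-implementation for illustration only, not the package's internal code.

```r
# "IF-diff" child rates, written as a childRates-style function.
if_diff_child_rates <- function(sr, st, pr, pt, ct, ze, si) {
  c(2 * pr * ze / (ze + 1), 2 * pr / (ze + 1))
}

# Direct call: a parent with rate 3 and ze = 2.
if_diff_rates <- if_diff_child_rates(sr = 1, st = 10, pr = 3, pt = NA,
                                     ct = c(NA, NA), ze = 2, si = 0)
print(if_diff_rates)                         # the two child rates
print(sum(if_diff_rates))                    # always 2 * pr
print(if_diff_rates[1] / if_diff_rates[2])   # always ze
```

The two rates average to the parent's rate, while their ratio equals ZETA. ZETA = 1 therefore gives both children exactly the parent's rate.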
S. J. Kersting, K. Wicke, and M. Fischer. Tree balance in phylogenetic models. arXiv:2406.05185, 2024.
S. J. Kersting, K. Wicke, and M. Fischer. Tree balance in phylogenetic models: Supplementary material. https://tinyurl.com/278cwdh8, 2024.
M. G. B. Blum and O. Francois. On statistical tests of phylogenetic tree imbalance: the Sackin and other indices revisited. Mathematical Biosciences, 195(2):141–153, 2005.
S. B. Heard. Patterns in phylogenetic tree balance with variable and evolving speciation rates. Evolution, 50(6):2141–2148, 1996.
S. J. Kersting. Genetic programming as a means for generating improved tree balance indices (Master’s thesis, University of Greifswald), 2020.
M. Kirkpatrick and M. Slatkin. Searching for evolutionary patterns in the shape of a phylogenetic tree. Evolution, 47(4):1171–1181, 1993.
genGrowTree(n = 5, use_built_in = "IF_sym", ZETA = 2)
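The result of this call can be inspected as in the following sketch. It assumes that genGrowTree is provided by the treebalance package and that the returned phylo object carries the usual tip.label and edge.length components. The call is skipped if the package is not installed.

```r
# Assumption: genGrowTree comes from the treebalance package.
example_tree <- NULL
if (requireNamespace("treebalance", quietly = TRUE)) {
  example_tree <- treebalance::genGrowTree(n = 5, use_built_in = "IF_sym",
                                           ZETA = 2)
  print(class(example_tree))              # class of the returned object
  print(length(example_tree$tip.label))   # number of leaves
  print(example_tree$edge.length)         # edge lengths of the tree
}
```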