powerComp
- Compare the power of a set of TSS to identify trees
generated under different alternative models given a null model.
powerComp_RegAcc
- Compare the power of a set of TSS to identify trees
generated under different alternative models given a the region(s) of
acceptance.
powerComp(
tss,
null_model = "yule",
alt_models,
n,
distribs = "exact_if_possible",
N_null = 10000L,
N_alt = 1000L,
test_type = "two-tailed",
correction = "small-sample",
sig_lvl = 0.05
)powerComp_RegAcc(
tss,
accept_regions,
null_model,
alt_models,
n,
distribs = "exact_if_possible",
N_null = 10000L,
N_alt = 1000L,
test_type = "two-tailed",
correction = "small-sample",
sig_lvl = 0.05
)
powerComp
Returns an object of class 'poweRbal_data' which
is a list containing the following objects:
power: Numeric matrix containing the power values (one row per TSS and one
column per alternative model).
accept_regions: Numeric matrix containing the information on the region of
acceptance (one row per TSS and four columns).
CIradius: Numeric matrix containing the confidence interval radii (one row
per TSS and one column per alternative model).
actual_sample_sizes: Numeric vector containing the actual sample sizes
under each alternative model as some models do not always successfully
generate trees.
other input data.
powerComp_RegAcc
Returns an object of class 'poweRbal_data'
similar to powerComp
.
Vector containing the names (as character) of the tree shape
statistics that should be compared. You may either use the short names
provided in tssInfo
to use the already included TSS, or use the
name of a list object containing similar information as the entries in
tssInfo
. Example:
Use "new_tss"
as the name for the list object
new_tss
containing at least the function
new_tss$func = function(tree){...}
,
and optionally also the information new_tss$short
,
new_tss$simple
, new_tss$name
, new_tss$type
,
new_tss$only_binary
, and new_tss$safe_n
.
The null model that is to be used to determine the power
of the tree shape statistics. In general, it must be a function that
produces rooted binary trees in phylo
format.
If the respective model is included in this
package, then specify the model and its parameters by using a character
or list. Available are all options listed under parameter tm
in
the documentation of function genTrees
(type ?genTrees
).
If you want to include your own tree model, then use the
name of a list object containing the function (dependent on one parameter
n
). Example:
Use "new_tm"
for the list object
new_tm <- list(func = function(n, Ntrees){...})
.
List containing the alternative models that are to be
used to determine the power of the tree shape statistics. Functions that
produce rooted binary trees in phylo
format. The information of each
single model must be in the format described for null_model
.
Integer value that specifies the desired number of leaves, i.e., vertices with in-degree 1 and out-degree 0.
Determines how the distributions (and with that the
bounds of the critical region) are computed. Available are:
"exact_if_possible" (default): Tries to compute the exact distribution
under the null model if possible. Currently, this is only implemented for
null_model = "yule"
, "pda"
, or "etm"
, and
n
<=20. In all other cases the distribution is approximated
by sampling N_null
many trees under the null model as in the
option "sampled" below.
"sampled": N_null
many trees are sampled under the
null model to approximate the distribution.
Sample size (integer >=10) if distributions are sampled (default = 10000L).
Sample size (integer >=10) for the alternative models to estimate the power (default = 1000L).
Determines the method. Available are:
"two-tailed" (default): The lower and upper bound of the region of
acceptance are determined based on the (empirical) distribution function
such that P(TSS < lower bound) <= sig_lvl
/2 and
P(TSS > upper bound) <= sig_lvl
/2. See parameter correction
for specifying how conservative the test should be: the null
hypothesis can either be rejected only if the values are strictly outside of
this region of acceptance (can be too conservative) or it can also be
rejected (with certain probabilities) if the value equals the lower or
upper bound.
"two-tailed-unbiased": Experimental - Use with caution!
The region of acceptance is optimized to yield an unbiased test, i.e., a test
that identifies non-null models with a probability of at least
sig_lvl
.
The region of acceptance is determined similar to the default method.
However, it need not be symmetrical, i.e., not necessarily
cutting off sig_lvl
/2 on both sides. Also see parameter
correction
for specifying how conservative the test should be.
Specifies the desired correction method.
Available are:
"small-sample" (default): This method tries to ensure that the critical
region, i.e. the range of values for which the null hypothesis is rejected,
is as close to sig_lvl
as possible (compared with "none" below, which
can be too conservative). The idea is that the null hypothesis is also
rejected with certain probabilities if the value matches the value of a
quantile.
"none": No correction method is applied. With that the test might be slightly too conservative as the null hypothesis is maintained if the values is >= the lower and <= the upper quantile.
Level of significance (default = 0.05, must be >0 and <1).
Numeric matrix (one row per TSS) with two or four
columns: The first two columns contain the interval limits of the region
of acceptance, i.e., we reject the null hypothesis for values strictly
outside of this interval. The third and fourth columns contain the
probabilities to reject the null hypothesis if values equal the lower or
upper bound, respectively. If the last two columns are missing they are
interpreted as zeroes. See return value of getAccRegion()
.
S. J. Kersting, K. Wicke, and M. Fischer. Tree balance in phylogenetic models. arXiv:2406.05185, 2024.
powerComp(tss = c("Sackin", "Colless", "B1I"),
alt_models = list(list("aldous",-1), "pda", "etm"), n = 10L,
distribs = "sampled", N_null = 40L, N_alt = 20L)
powerComp_RegAcc(tss = c("Sackin", "Colless", "B1I"),
accept_regions = getAccRegion(tss = c("Sackin", "Colless", "B1I"),
n = 6L, null_model = "etm",
N_null = 20L, distribs = "sampled"),
null_model = "etm", distribs = "sampled",
alt_models = list(list("aldous",-1), "pda", "yule"), n = 6L,
N_null = 20L, N_alt = 20L)
Run the code above in your browser using DataLab