getAccRegion
- Computes the region of acceptance based on quantiles
for a specified level of significance and method.
getAccRegion_sampled
- Computes a sampling-based region of acceptance
for the given null model based on quantiles for a specified level of
significance and method.
getAccRegion_exact
- Computes the exact region of acceptance for the
given null model based on quantiles for a specified level of significance
and method. Currently, this is only implemented for
null_model = "yule", "pda", or "etm", and n <= 20.
computeAccRegion
- Computes the bounds of the region of acceptance
given the empirical distribution function (specified by the unique values
and their probabilities under the null model) for specified cut-offs
(e.g., 0.025 on both sides for a symmetric two-tailed test).
For values strictly outside of the interval the null hypothesis is
rejected.
This function also computes the probabilities of rejecting the null
hypothesis if the value equals the lower or upper bound of the region of
acceptance. These probabilities are 0 for the correction method "none";
for "small-sample" they are chosen so that the overall probability of
rejection exactly matches the specified cut-offs.
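The computation described for computeAccRegion can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: acc_region_sketch is a hypothetical helper, the lower bound is taken as the largest value v with P(X < v) <= cutoff_left (and analogously for the upper bound), and the case where both bounds coincide is not handled.

```r
# Hypothetical helper (not part of the package): bounds of the region of
# acceptance and rejection probabilities at the bounds for a discrete
# null distribution given by its unique values and their probabilities.
acc_region_sketch <- function(vals, probs,
                              correction = c("small-sample", "none"),
                              cutoff_left, cutoff_right) {
  correction <- match.arg(correction)
  stopifnot(length(vals) == length(probs),
            cutoff_left >= 0, cutoff_right >= 0,
            cutoff_left + cutoff_right < 1)
  ord <- order(vals)
  vals <- vals[ord]
  probs <- probs[ord]
  # P(X < v) and P(X > v) for every unique value v
  below <- c(0, cumsum(probs)[-length(probs)])
  above <- c(rev(cumsum(rev(probs)))[-1], 0)
  lo <- max(which(below <= cutoff_left))   # largest v with P(X < v) <= cutoff_left
  up <- min(which(above <= cutoff_right))  # smallest v with P(X > v) <= cutoff_right
  p_lo <- 0
  p_up <- 0
  if (correction == "small-sample") {
    # Spend the unused part of each cut-off on the bound itself
    p_lo <- (cutoff_left - below[lo]) / probs[lo]
    p_up <- (cutoff_right - above[up]) / probs[up]
  }
  c(lower = vals[lo], upper = vals[up],
    p_reject_lower = p_lo, p_reject_upper = p_up)
}

res <- acc_region_sketch(vals = c(1, 2, 3, 4, 5),
                         probs = c(0.1, 0.4, 0.1, 0.2, 0.2),
                         correction = "small-sample",
                         cutoff_left = 0.15, cutoff_right = 0.15)
print(res)
```

In this sketch, P(X < 2) = 0.1 leaves 0.05 of the left cut-off unused, so the value 2 is rejected with probability 0.05 / 0.4; on the right, P(X > 5) = 0, so the value 5 is rejected with probability 0.15 / 0.2.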
getAccRegion(
tss,
null_model = "yule",
n,
distribs = "exact_if_possible",
N_null = 10000L,
N_alt = 1000L,
N_intervals = 1000L,
test_type = "two-tailed",
correction = "small-sample",
sig_lvl = 0.05
)
getAccRegion_sampled(
tss,
null_model = "yule",
n,
N_null,
N_alt = 1000L,
N_intervals = 1000L,
test_type = "two-tailed",
correction = "small-sample",
sig_lvl = 0.05
)
getAccRegion_exact(
tss,
null_model = "yule",
n,
N_alt = 1000L,
N_intervals = 1000L,
test_type = "two-tailed",
correction = "small-sample",
sig_lvl = 0.05
)
computeAccRegion(
unique_null_vals,
unique_null_probs,
correction,
cutoff_left,
cutoff_right
)
getAccRegion
Numeric matrix (one row per TSS) with four
columns: The first two columns contain the interval limits of the region
of acceptance, i.e., we reject the null hypothesis for values strictly
outside of this interval. The third and fourth columns contain the
probabilities to reject the null hypothesis if values equal the lower or
upper bound, respectively.
getAccRegion_sampled
Numeric matrix (one row per TSS) with
four columns, structured as described for getAccRegion.
getAccRegion_exact
Numeric matrix (one row per TSS) with
four columns, structured as described for getAccRegion.
computeAccRegion
Numeric vector with
four elements, structured like one row of the matrix described for
getAccRegion.
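The way such a result is meant to be read can be sketched as a small decision rule. This is only an illustration: reject_null is a hypothetical helper, and the numbers in acc_row are made-up placeholder values, not output of the package.

```r
# Hypothetical helper: decide whether to reject the null hypothesis for an
# observed TSS value, given one row (lower, upper, p_lower, p_upper) of an
# acceptance-region result. 'u' is a uniform random number in [0, 1).
reject_null <- function(value, acc_row, u = runif(1)) {
  lower <- acc_row[1]
  upper <- acc_row[2]
  # Values strictly outside the interval are always rejected
  if (value < lower || value > upper) return(TRUE)
  # Values on a bound are rejected with the given probability
  if (value == lower && u < acc_row[3]) return(TRUE)
  if (value == upper && u < acc_row[4]) return(TRUE)
  FALSE
}

# Placeholder row for illustration only
acc_row <- c(2, 5, 0.125, 0.75)
reject_null(1, acc_row)           # strictly below the lower bound
reject_null(3, acc_row)           # inside the region of acceptance
reject_null(5, acc_row, u = 0.5)  # on the upper bound, randomized decision
```

Passing u explicitly makes the randomized decision reproducible; by default a fresh uniform random number is drawn.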
tss
Vector containing the names (as character) of the tree shape
statistics that should be compared. You may either use the short names
provided in tssInfo to use the already included TSS, or use the
name of a list object containing similar information as the entries in
tssInfo. Example: Use "new_tss" as the name for the list object
new_tss containing at least the function
new_tss$func = function(tree){...}, and optionally also the information
new_tss$short, new_tss$simple, new_tss$name, new_tss$type,
new_tss$only_binary, and new_tss$safe_n.
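A user-defined statistic of this kind might look like the following sketch. It assumes that the tree passed to func is a phylo object whose leaves are numbered 1, ..., n in the edge matrix (the usual phylo convention). The statistic (number of cherries) and the helper name count_cherries are chosen purely for illustration; the optional entries are omitted because their expected formats are not specified here.

```r
# Illustrative statistic: number of cherries, i.e., internal nodes whose
# children are all leaves. Assumes leaves are numbered 1..n in tree$edge.
count_cherries <- function(tree) {
  n <- length(tree$tip.label)
  parents <- tree$edge[, 1]
  children <- tree$edge[, 2]
  # For every internal node: are all of its children leaves?
  all_children_leaves <- tapply(children <= n, parents, all)
  sum(all_children_leaves)
}

# List object that could be referenced by its name "new_tss"
new_tss <- list(func = count_cherries)

# Small hand-built phylo object (4 leaves, fully balanced) for a quick check
balanced4 <- structure(
  list(edge = matrix(c(5L, 6L, 6L, 5L, 7L, 7L,
                       6L, 1L, 2L, 7L, 3L, 4L), ncol = 2),
       tip.label = c("t1", "t2", "t3", "t4"),
       Nnode = 3L),
  class = "phylo")
new_tss$func(balanced4)

# Not run here: the object would then be passed by name, e.g.
# getAccRegion(tss = c("Sackin", "new_tss"), n = 6L)
```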
null_model
The null model that is to be used to determine the power
of the tree shape statistics. In general, it must be a function that
produces rooted binary trees in phylo format.
If the respective model is included in this
package, then specify the model and its parameters by using a character
or list. Available are all options listed under parameter tm in
the documentation of function genTrees (type ?genTrees).
If you want to include your own tree model, then use the
name of a list object containing the function (with the two input parameters
n and Ntrees). Example: Use "new_tm" for the list object
new_tm <- list(func = function(n, Ntrees){...}).
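The shape of such a list object can be sketched with a deliberately trivial model that always returns caterpillar trees. Everything here is an assumption made for illustration: make_caterpillar is a hypothetical helper that builds a phylo object by hand, and returning a plain list of Ntrees phylo objects is a guess at a suitable return format, since it is not specified above.

```r
# Hypothetical helper: build a rooted binary caterpillar tree with n leaves
# as a phylo object (leaves 1..n, root n+1, internal nodes n+1..2n-1).
make_caterpillar <- function(n) {
  n <- as.integer(n)
  stopifnot(n >= 2L)
  internal <- n + seq_len(n - 1L)
  parent <- rep(internal, each = 2)
  child <- integer(0)
  for (i in seq_len(n - 1L)) {
    # Each internal node has leaf i and the next internal node as children;
    # the last internal node has the two remaining leaves as children.
    second <- if (i < n - 1L) n + i + 1L else n
    child <- c(child, i, second)
  }
  structure(
    list(edge = matrix(c(parent, child), ncol = 2),
         tip.label = paste0("t", seq_len(n)),
         Nnode = n - 1L),
    class = "phylo")
}

# List object that could be referenced by its name "new_tm".
# Assumption: a list of Ntrees phylo objects is an acceptable return value.
new_tm <- list(func = function(n, Ntrees) {
  lapply(seq_len(Ntrees), function(i) make_caterpillar(n))
})

trees <- new_tm$func(n = 4, Ntrees = 3)
length(trees)
trees[[1]]$edge
```

A realistic model would draw random trees instead; the deterministic caterpillar is used only to keep the sketch free of additional dependencies.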
n
Integer value that specifies the desired number of leaves, i.e., vertices with in-degree 1 and out-degree 0.
distribs
Determines how the distributions (and with that the
bounds of the critical region) are computed. Available are:
"exact_if_possible" (default): Tries to compute the exact distribution
under the null model if possible. Currently, this is only implemented for
null_model = "yule", "pda", or "etm", and n <= 20. In all other cases
the distribution is approximated by sampling N_null many trees under the
null model as in the option "sampled" below.
"sampled": N_null many trees are sampled under the
null model to approximate the distribution.
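For the "sampled" option, the step from a sample of statistic values to an empirical distribution (unique values and their relative frequencies) can be sketched as follows. The values in sampled_vals are hypothetical placeholders standing in for a TSS evaluated on N_null sampled trees; this is not the package's internal code.

```r
# Hypothetical TSS values of 10 trees sampled under the null model
sampled_vals <- c(10, 12, 10, 14, 12, 10, 16, 12, 10, 14)

# Empirical distribution: unique values (ascending) and relative frequencies
tab <- table(sampled_vals)
unique_null_vals <- as.numeric(names(tab))
unique_null_probs <- as.numeric(tab) / length(sampled_vals)

data.frame(value = unique_null_vals, prob = unique_null_probs)
```

The two resulting vectors have the form expected by the arguments unique_null_vals and unique_null_probs of computeAccRegion.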
N_null
Sample size (integer >=10) if distributions are sampled (default = 10000L).
N_alt
Sample size (integer >=10) for the alternative models to
estimate the power (default = 1000L). Only needed here if the
test_type is "two-tailed-unbiased".
N_intervals
Number (integer >=3, default = 1000L) of different
quantile/cut-off pairs investigated as potential bounds of the region of
acceptance. This parameter is only necessary if the test_type is
"two-tailed-unbiased".
test_type
Determines the method. Available are:
"two-tailed" (default): The lower and upper bound of the region of
acceptance are determined based on the (empirical) distribution function
such that P(TSS < lower bound) <= sig_lvl/2 and
P(TSS > upper bound) <= sig_lvl/2. See parameter correction
for specifying how conservative the test should be: the null
hypothesis can either be rejected only if the values are strictly outside of
this region of acceptance (can be too conservative) or it can also be
rejected (with certain probabilities) if the value equals the lower or
upper bound.
"two-tailed-unbiased": Experimental - Use with caution!
The region of acceptance is optimized to yield an unbiased test, i.e., a test
that identifies non-null models with a probability of at least sig_lvl.
The region of acceptance is determined similarly to the default method.
However, it need not be symmetrical, i.e., it does not necessarily
cut off sig_lvl/2 on both sides. Also see parameter correction
for specifying how conservative the test should be.
correction
Specifies the desired correction method.
Available are:
"small-sample" (default): This method tries to ensure that the probability
of the critical region, i.e., the range of values for which the null
hypothesis is rejected, is as close to sig_lvl as possible (compared with
"none" below, which can be too conservative). The idea is that the null
hypothesis is also rejected with certain probabilities if the value matches
a bound of the region of acceptance.
"none": No correction method is applied. The test might then be slightly
too conservative, as the null hypothesis is maintained for all values
>= the lower and <= the upper bound.
sig_lvl
Level of significance (default=0.05, must be >0 and <1).
unique_null_vals
Numeric vector containing all the unique values under the null model.
unique_null_probs
Numeric vector containing the corresponding probabilities of the unique values under the null model.
cutoff_left
Numeric value (>=0, <1) specifying the cut-off of the distribution for the lower bound of the region of acceptance. The sum of the two cut-offs must be <1.
cutoff_right
Numeric value (>=0, <1) specifying the cut-off of the distribution for the upper bound of the region of acceptance. The sum of the two cut-offs must be <1.
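The difference between the correction methods "none" and "small-sample" can be made concrete with a small worked calculation for a discrete null distribution and cut-offs of 0.15 on both sides. The bounds 2 and 5 used below are derived by hand under the assumption that the lower bound is the largest value v with P(X < v) <= 0.15 and the upper bound the smallest value v with P(X > v) <= 0.15; the package may resolve such cases differently.

```r
vals  <- c(1, 2, 3, 4, 5)
probs <- c(0.1, 0.4, 0.1, 0.2, 0.2)
cutoff_left <- 0.15
cutoff_right <- 0.15

# Bounds derived by hand: P(X < 2) = 0.1 <= 0.15 and P(X > 5) = 0 <= 0.15
lower <- 2
upper <- 5

# "none": reject only for values strictly outside [lower, upper]
size_none <- sum(probs[vals < lower | vals > upper])

# "small-sample": additionally reject on the bounds with these probabilities
p_lower <- (cutoff_left - sum(probs[vals < lower])) / probs[vals == lower]
p_upper <- (cutoff_right - sum(probs[vals > upper])) / probs[vals == upper]
size_small_sample <- size_none +
  p_lower * probs[vals == lower] +
  p_upper * probs[vals == upper]

c(size_none = size_none, size_small_sample = size_small_sample)
```

Without correction the test rejects with probability 0.1 under the null model although 0.3 was requested; with the randomized rejection on the bounds the requested 0.3 is reached.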
getAccRegion(tss = c("Sackin", "Colless", "B1I"), n = 6L)
getAccRegion(tss = c("Sackin", "Colless", "B1I"), n = 6L, null_model = "etm",
N_null = 20L, correction = "none", distribs = "sampled")
getAccRegion(tss = c("Sackin", "Colless", "B1I"), n = 6L, N_null = 20L,
test_type = "two-tailed-unbiased", N_intervals = 5L,
N_alt = 10L)
getAccRegion_sampled(tss = c("Sackin", "Colless", "B1I"), n = 6L,
N_null = 20L, correction = "none")
getAccRegion_exact(tss = c("Sackin", "Colless", "B1I"),
null_model = "etm", n = 8L)
computeAccRegion(unique_null_vals = c(1,2,3,4,5),
unique_null_probs = c(0.1,0.4,0.1,0.2,0.2),
correction = "small-sample",
cutoff_left = 0.15, cutoff_right = 0.15)