MrBayes is not great for getting samples of dated posterior phylogenies, or for obtaining certain summary trees from the posterior (specifically the MCCT and MAP, which are specific trees in the posterior). This is because the tree samples as returned are scaled relative to rate parameters in a separate file. This function attempts to automate the handling of multiple files (both .t tree files and .p parameter files), as well as multiple files associated with separate runs, to obtain samples of posterior trees, or summary trees such as the MCCT or MAP. These resulting trees are now scaled to units of time, but not be placed correctly on an absolute time-scale if all tips are extinct. See details of output below.
obtainDatedPosteriorTreesMrB(runFile, nRuns = 2, burnin = 0.5, outputTrees,
labelPostProb = FALSE, getFixedTimes = FALSE, getRootAges = FALSE,
originalNexusFile = NULL, file = NULL)
A filename in the current directory, or a path to a file that is either a .p
or .t file from a MrBayes analysis. This filename and path will be used for finding additional
.t and .p files, via the nRuns
settings and assuming that files are in the
same directory and these files are named under
typical MrBayes file naming conventions. (In other words, if you have renamed your .p or .t files,
this function probably won't be able to find them.)
The number of runs in your analysis. This variable is used for figuring out what filenames will be searched for: if you specify that you have less runs than you actually ran in reality, then some runs won't be examined in thi function. Conversely, specify too many, and this function will throw an error when it cannot find files it expects but do not exist. The default for this argument (two runs) is based on the default number of runs in MrBayes.
The fraction of trees sampled in the posterior discarded and not returned by this function directly, nor included in calculation of summary trees. Must be a numeric value greater than 0 and less than 1.
Determines the output trees produced; for format of output, see section
on returned Value below. Must be of length one, and either "all"
,
which means all trees from the post-burnin posterior will
returned, a number greater than zero, which will be the number of trees
randomly sampled from across the post-burning posterior and returned,
or "MCCT"
and "MAP"
, which stand for 'maximum clade compatibility tree'
and 'maximum a posteri tree' respectively. The MAP is the single tree from the post-burnin
posterior with the highest marginal likelihood. The MCCT is the single tree from the post-burnin
posterior which contains clades with the highest product of posterior probabilities for its
component clades. Thus, the MAP is the best overall tree, while the MCCT may be the best
tree for summarizing topological support.
Logical. If TRUE
, then nodes of the
output tree will be labeled with their respective posterior
probabilities, as calculated based on the frequency of a clade
occurring across the post-burnin posterior tree sample. If FALSE
,
this is skipped.
If TRUE
, this function will also look for, scan, and parse an
associated NEXUS file. Ignoring any commented lines (ie. anything between "[ ]" ), commands
for fixing taxa will be identified, parsed and returned to the user, either as a message
printed to the R console if output is read to a file, or as a attribute named 'fixed ages'
if output as an R object (formatted as a two-column table of OTU names and their respective fixed ages).
If the output is an R object, these objects with
Please note: the code for getFixedTimes = TRUE
contains a while()
loop in it for removing nested series of
square brackets (i.e. treated as comments in NEXUS files). Thus files with
ridiculously nested series of brackets may cause this code to take a while
to complete, or may even cause it to hang.
FALSE
by default.
If TRUE
, and getFixedTimes = TRUE
as well as file = NULL
(such that trees will be assigned within the R memory rather than saved to an external file), the
functions setRootAge
and its wrapper function setRootAges
will be applied to the output
so that all output trees have root.time
elements for use with other functions in paleotree
as well as other packages.
Filename (and possibly path too) to the original NEXUS file for this analysis.
Only tried if getFixedTimes = TRUE
. If NULL
(the default), then this function will
instead try to find a NEXUS file with the same name as implied by the filename used in other inputs. If
this file cannot be found, the function will fail.
Filename (possibly with path) as a character string
leading to a file which will be overwritten with the output trees (or summary tree),
as a NEXUS file. If NULL
(the default), the output will
instead be directly returned by this function.
Depending on argument file
, the output tree or trees is either
returned directly, or instead written out in NEXUS format via
ape's write.NEXUS
function to an external file. The output
will consist either of multiple trees sampled from the post-burnin posterior,
or will consist of a single phylogeny (a summary tree, either the MCCT or the MAP).
If the argument setRootAges = TRUE
is not used,
users are warned that the resulting dated trees will not have $root.time
elements necessary
for comparison against an absolute time-scale. Wile the trees may be scaled to units of absolute
time now, rather than with branch lengths expressed in the rate of character change, the dates
estimated by some phylogenetics functions in R may give inaccurate estimates of when events
occur on the absolute time-scale if all tips are extinct.
This is because most functions for phylogenetics in R (and
elsewhere) will instead presume that the latest tip will be at time 0 (the modern), which
may be wrong if you are using paleotree
for analyzing paleontological datasets
consisting of entirely extinct taxa. This can be solved by using argument getFixedTimes = TRUE
to obtain fixed tip ages, and then scaling the resulting output to absolute time using
the argument setRootAges = TRUE
, which obtains a $root.time
element for each tree
using the functions setRootAge
and setRootAges
(for single and multiple phylogenies).
This function is most useful for dealing with dating analyses in MrBayes, particularly when tip-dating a tree with fossil taxa, as the half-compatibility and all-compatibility summary trees offered by the 'sumt' command in MrBayes can have issues properly portraying summary trees from such datasets.
When the arguments getFixedTimes = TRUE
and setRootAges = TRUE
are used, the resulting output will be scaled to absolute time
with the available fixed ages using functions setRootAge
and setRootAges
(for single and multiple phylogenies).
This is only done if fixed ages are available and if the tree is not being saved to an external file.
Maximum Clade Credibility trees are estimated using the functionmaxCladeCred
in package phangorn.
# NOT RUN {
MCCT <- obtainDatedPosteriorTreesMrB(
runFile = "C:\\myTipDatingAnalysis\\MrB_run_fossil_05-10-17.nex.run1.t",
nRuns = 2, burnin = 0.5,
outputTrees = "MCCT", file = NULL)
MAP <- obtainDatedPosteriorTreesMrB(
runFile = "C:\\myTipDatingAnalysis\\MrB_run_fossil_05-10-17.nex.run1.t",
nRuns = 2, burnin = 0.5, getFixedTimes = TRUE,
outputTrees = "MAP", file = NULL)
# get a root age from the fixed ages for tips
setRootAge(tree = MAP)
#pull a hundred trees randomly from the posterior
hundredRandomlySelectedTrees <- obtainDatedPosteriorTreesMrB(
runFile = "C:\\myTipDatingAnalysis\\MrB_run_fossil_05-10-17.nex.run1.t",
nRuns = 2, burnin = 0.5, getFixedTimes = TRUE,
getRootAges = TRUE,
outputTrees = 100, file = NULL)
# }
# NOT RUN {
# }
Run the code above in your browser using DataLab