pqr2Ps: Joint Probability of a Clade Surviving Infinitely or Being Sampled Once
Description
Given the rates of branching, extinction and sampling,
calculates the joint probability of a random clade (of
unknown size, from one to infinitely many members) either
(a) never going extinct on an infinite time-scale or (b)
being sampled at least once, if it does eventually go
extinct. As we often assume
perfect or close to perfect sampling at the modern (and
thus we can blanket assume that living groups are sampled),
we refer to this value as the Probability of Being Sampled,
or simply P(s). This quantity is useful for calculating the
probability distributions of waiting times that depend on a
clade being sampled (or not).
Usage
pqr2Ps(p, q, r, useExact = TRUE)
Arguments
useExact
If TRUE, an exact solution developed by
Emily King is used; if FALSE, an iterative, inexact
solution is used, which is somewhat slower (in addition
to being inexact...).
p
Instantaneous rate of speciation (lambda). If
the underlying model assumed is anagenetic (e.g.
taxonomic change within a single lineage, 'phyletic
evolution') with no branching of lineages, then p will be
used as the rate of anagenetic differentiation.
q
Instantaneous rate of extinction (mu)
r
Instantaneous rate of sampling
Value
Returns a single numerical value, representing the joint
probability of a clade generated under these rates either
never going extinct or being sampled before it goes
extinct.
Details
Note that the use of the word 'clade' here can mean a
monophyletic group of any size, including a single
'species' (i.e. a single phylogenetic branch) that goes
extinct before producing any descendants. Many scientists I
have met reserve the word 'clade' only for groups that
contain at least one branching event, and thus contain at
least two 'species'. I personally prefer to use the generic term
'lineage' to refer to monophyletic groups of one to
infinity members, but others reserve this term for a set of
morphospecies that reflect an unbroken anagenetic chain.
Obviously the equation used makes assumptions about prior
knowledge of the time-scales associated with clades being
extant or not: if we're talking about clades that
originated a short time before the recent, the clades that
will go extinct on an infinite time-scale probably haven't
had enough time to actually go extinct. On reasonably long
time-scales, however, this infinite-time assumption should
be a reasonable approximation, as clades that survive
'forever' in a homogeneous birth-death scenario are those
that get
very large immediately (similarly, most clades that go
extinct also go extinct very shortly after originating...
yes, life is tough).
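The 'never going extinct' component can be made concrete
with a minimal sketch, assuming only the exact expression
quoted at the end of this section. The helper name exactPs
is hypothetical and is not part of the package. With no
sampling (r = 0) and p > q, the expression reduces to
1 - q/p, the probability of surviving forever; with no
sampling and q > p it reduces to zero.

```r
# Hypothetical helper (not part of the package): the exact expression
# for P(s) quoted in this section, written as a plain R function.
exactPs <- function(p, q, r) {
  total <- p + q + r
  1 - (total - sqrt(total^2 - 4 * p * q)) / (2 * p)
}

# No sampling, speciation exceeds extinction: P(s) reduces to 1 - q/p,
# the probability that the clade never goes extinct.
noSampling <- exactPs(p = 0.2, q = 0.1, r = 0)

# No sampling, extinction exceeds speciation: every clade eventually
# goes extinct unsampled, so P(s) is (numerically) zero.
doomed <- exactPs(p = 0.1, q = 0.2, r = 0)

# Same rates, but with a non-zero sampling rate: some clades are now
# sampled before going extinct, so P(s) rises above zero.
withSampling <- exactPs(p = 0.1, q = 0.2, r = 0.1)
```

The rate values are arbitrary and chosen only for
illustration.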
Both an exact and an inexact (iterative) solution are offered;
the exact solution was derived in an entirely different
fashion but seems to faithfully reproduce the results of
the inexact solution and is much faster. Thus, the exact
solution is the default. As it would be very simple for any
user to look this up in the code anyway, here's the
unpublished equation for the exact solution:
$Ps = 1 - \frac{(p+q+r) - \sqrt{(p+q+r)^2 - 4pq}}{2p}$
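The expression can be cross-checked numerically. What
follows is a minimal sketch under a stated assumption, not
the iterative solution used when useExact = FALSE (whose
details are not given here). The assumption is the usual
first-event argument: the first thing to happen to a
lineage is extinction, branching or sampling, with
probabilities q, p and r over (p+q+r). The probability E of
going extinct without ever being sampled then satisfies
E = (q + p*E^2)/(p+q+r), and Ps = 1 - E. The function names
exactPs and iterPs are hypothetical.

```r
# Hypothetical helper: the exact expression for Ps given above.
exactPs <- function(p, q, r) {
  total <- p + q + r
  1 - (total - sqrt(total^2 - 4 * p * q)) / (2 * p)
}

# Hypothetical cross-check (an assumed technique, not the package's
# inexact solution): fixed-point iteration on
#   E = (q + p * E^2) / (p + q + r),
# started from E = 0 so that it approaches the smaller root.
iterPs <- function(p, q, r, tol = 1e-12, maxIter = 100000) {
  total <- p + q + r
  E <- 0
  for (i in seq_len(maxIter)) {
    Enew <- (q + p * E^2) / total
    converged <- abs(Enew - E) < tol
    E <- Enew
    if (converged) break
  }
  1 - E
}

# Arbitrary illustrative rates.
exactValue <- exactPs(p = 0.1, q = 0.1, r = 0.1)
iterValue <- iterPs(p = 0.1, q = 0.1, r = 0.1)
```

For these rates the two values should agree to within the
iteration tolerance. Convergence can be slow when r is
zero and p is close to q, which is why the loop is capped
at maxIter.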
References
Bapst, D. W., E. A. King and M. W. Pennell. In prep.
Probability models for branch lengths of paleontological
phylogenies.
Bapst, D. W. 2013. A stochastic rate-calibrated method for
time-scaling phylogenies of fossil taxa. Methods in
Ecology and Evolution. 4(8):724-733.