simFossilTaxa_SRCond(r, avgtaxa, p, q, anag.rate = 0, prop.bifurc = 0,
prop.cryptic = 0, nruns = 1, mintime = 1, maxtime = 1000,
minExtant = 0, maxExtant = NULL, count.cryptic = FALSE,
print.runs = FALSE, plot = FALSE)
simFossilTaxa so please see the help file for
that function in addition.
simFossilTaxa_SRCond is a wrapper for simFossilTaxa for
when clades of a particular size are desired,
post-sampling. simFossilTaxa simulates a birth-death
process (Kendall, 1948; Nee, 2006), but unlike most
functions for this implemented in R, this function enmeshes
the simulation of speciation and extinction with explicit
models of how lineages are morphologically differentiated,
as morphotaxa are the basic units of paleontological
estimates of diversity and phylogenetics. For more details
on the workings of simFossilTaxa and many of the arguments
involved, please see the help file for simFossilTaxa.
simFossilTaxa_SRCond first
calculates the expected proportion of taxa sampled, given
the sampling rate and the rates which control lineage
termination: extinction, anagenesis and bifurcation. The
average original clade size needed to produce, on average,
a given number of sampled taxa (the argument 'avgtaxa') is
calculated with the following equation:
$N = avgtaxa / (r / (r + q + anag.rate + (prop.bifurc * p)))$
We will call that quantity N. Note that the quantity
(prop.bifurc * p) describes the rate of bifurcation when
there is no cryptic cladogenesis, as prop.bifurc is the
proportion of cladogenesis that is bifurcating rather than budding. This equation
will diverge in ways that are not easily predicted as the
rate of cryptic speciation increases. Note that, as of version
1.5, this equation was altered to the form above. The
previous form was similar and, at values of avgtaxa greater
than 10 or so, produced almost identical values. The above
is preferred for its relationship to taxonomic completeness
(Solow and Smith, 1997).
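As a worked illustration of the equation above, the following base R sketch computes N by hand. The parameter values are illustrative only (they mirror the example call further down this page), and the variable names term_rate and prop_sampled are introduced here for readability; they are not arguments of the function. Whether the function rounds N internally is not stated here, so no rounding is applied.

```r
# Illustrative values only; not defaults of simFossilTaxa_SRCond
avgtaxa <- 50
r <- 0.5          # sampling rate
p <- 0.1          # cladogenesis (origination) rate
q <- 0.1          # extinction rate
anag.rate <- 0    # anagenesis rate
prop.bifurc <- 0  # proportion of cladogenesis that is bifurcating

# Total rate at which a morphotaxon terminates (hypothetical helper name)
term_rate <- q + anag.rate + prop.bifurc * p

# Expected proportion of taxa sampled at least once
prop_sampled <- r / (r + term_rate)

# Unsampled clade size needed to yield avgtaxa sampled taxa on average
N <- avgtaxa / prop_sampled

# Bounds described in the text: mintaxa = N, maxtaxa = 2*N
mintaxa <- N
maxtaxa <- 2 * N

print(c(prop_sampled = prop_sampled, N = N, mintaxa = mintaxa, maxtaxa = maxtaxa))
```

With these values, five in six taxa are expected to be sampled, so about 60 taxa must be simulated to obtain 50 sampled taxa on average.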
Next, this value is used with simFossilTaxa, where mintaxa
is set to N and maxtaxa is set to 2*N. simFossilTaxa_SRCond
will generally produce simulated datasets that are
of that size or larger post-sampling (although
there can be some variance). Some combinations of
parameters may take an extremely long time to find large
enough datasets. Some combinations may produce unusual
datasets whose structure is purely an artifact
of the conditioning (for example, the only clades
that have many taxa when net diversification is low or
negative will have lots of very early divergences, which
could impact analyses). Needless to say, conditioning can
be very difficult.
See also: simFossilTaxa, sampleRanges, simPaleoTrees, taxa2phylo, taxa2cladogram
set.seed(444)
avgtaxa <- 50
r <- 0.5
#using the SRCond version
taxa <- simFossilTaxa_SRCond(r=r,p=0.1,q=0.1,nruns=20,avgtaxa=avgtaxa)
#now let's use sampleRanges and count number of sampled taxa
ranges <- lapply(taxa,sampleRanges,r=r)
ntaxa <- sapply(ranges,function(x) sum(!is.na(x[,1])))
hist(ntaxa)
mean(ntaxa)
#works okay... for some parameter combinations it is difficult to get the right number of taxa
layout(1)
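The expected proportion of taxa sampled, r / (r + termination rate), can also be checked without the package. The sketch below is a base R illustration and is not how simFossilTaxa_SRCond or sampleRanges work internally. It assumes that taxon durations are exponentially distributed with a constant termination rate and that sampling events follow a Poisson process with rate r. The names term_rate, durations, n_finds, sim_prop and exp_prop are hypothetical.

```r
set.seed(1)

r <- 0.5          # sampling rate (illustrative)
term_rate <- 0.1  # q + anag.rate + prop.bifurc * p (illustrative)
n <- 100000       # number of simulated taxa

# Assumption: constant termination rate gives exponential durations
durations <- rexp(n, rate = term_rate)

# Assumption: sampling is a Poisson process, so the number of finds
# for a taxon is Poisson with mean r * duration
n_finds <- rpois(n, lambda = r * durations)

# Proportion of taxa sampled at least once, simulated vs. expected
sim_prop <- mean(n_finds > 0)
exp_prop <- r / (r + term_rate)

print(c(simulated = sim_prop, expected = exp_prop))
```

Under these assumptions the two proportions should agree closely, which is the reasoning behind dividing avgtaxa by this proportion to obtain N.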