Learn R Programming

paleotree (version 2.5)

simFossilRecord: Full-Scale Simulations of the Fossil Record with Birth, Death and Sampling of Morphotaxa

Description

A complete birth-death-sampling branching simulator that captures morphological-taxon identity of lineages, as is typically discussed in models of paleontological data. This function allows for the use of precise point constraints to condition simulation run acceptance and can interpret complex character strings given as rate values for use in modeling complex processes of diversification and sampling.

Usage

simFossilRecord(p, q, r = 0, anag.rate = 0, prop.bifurc = 0,
  prop.cryptic = 0, modern.samp.prob = 1, startTaxa = 1, nruns = 1,
  totalTime = c(0, 1000), nTotalTaxa = c(1, 1000), nExtant = c(0, 1000),
  nSamp = c(0, 1000), tolerance = 10^-4, maxStepTime = 0.01,
  shiftRoot4TimeSlice = "withExtantOnly", count.cryptic = FALSE,
  negRatesAsZero = TRUE, print.runs = FALSE, sortNames = FALSE,
  plot = FALSE)

Arguments

p,q,r,anag.rate
These parameters control the instantaneous ('per-capita') rates of branching, extinction, sampling and anagenesis, respectively. These can be given as a number equal to or greater than zero, or as a character string which will be interpreted as an algeb
prop.cryptic,prop.bifurc
These parameters control (respectively) the proportion of branching events that have morphological differentiation, versus those that are cryptic (prop.cryptic) and the proportion of morphological branching events that are bifurcating, as o
modern.samp.prob
The probability that a taxon is sampled at the modern time (or, for timeSliceFossilRecord, the time at which the simulation data is slice). Must be a number between 0 and 1. If 1, all taxa that survive to the modern day (to the slice
startTaxa
Number of initital taxa to begin a simulation with. All will have the simulation start date listed as their time of origination.
nruns
Number of simulation datasets to accept, save and output. If nruns = 1, output will be a single object of class fossilRecordSimulation, and if nruns is greater than 1, a list will be output composed of nruns<
totalTime,nTotalTaxa,nExtant,nSamp
These arguments represent stopping and acceptance conditions for simulation runs. They are respectively totalTime, the total length of the simulation in time-units, nTotalTaxa, the total number of taxa over the past evolutiona
tolerance
A small number which defines a tiny interval for the sake of placing run-sampling dates before events and for use in determining whether a taxon is extant in simFossilRecordMethods.
maxStepTime
When rates are time-dependent (i.e. when parameters 'D' or 'T' are used in equations input for one of the four rate arguments), then protocol used by simFossilRecord of drawing waiting times to the next event could produce a serious mismatc
shiftRoot4TimeSlice
Should the dating of events be shifted, so that the date given for sliceTime is now 0, or should the dates not be shifted, so that they remain on the same scale as the input? This argument accepts a logical TRUE or FALSE, but also accepts
count.cryptic
If TRUE, cryptic taxa are counted as separate taxa for conditioning limits that count a number of taxon units, such as nTotalTaxa, nExtant and nSamp. If FALSE (the default), then each cryptic complex (i.e. each di
negRatesAsZero
A logical. Should rates calculated as a negative number cause the simulation to fail with an error message (= FALSE) or should these be treated as zero ("= TRUE", the default). This is equivalent to saying that the rate.a
print.runs
If TRUE, prints the proportion of simulations accepted for output to the terminal.
sortNames
If TRUE, output taxonomic lists are sorted by the taxon names (thus sorting cryptic taxa together) rather than by taxon ID number (i.e. the order they were simulated in).
plot
If TRUE, plots the diversity curves of accepted simulations, including both the diversity curve of the true number of taxa and the diversity curve for the 'observed' (sampled) number of taxa.

Value

  • simFossilRecord returns either a single object of class fossilRecordSimulation or a list of multiple such objects, depending on whether nruns was 1 or more. An object of class fossilRecordSimulation consists of a list object composed of multiple elements, each of which is data for 'one taxon'. Each data element for each taxon is itself a list, composed of two elements: the first describes vital information about the taxon unit, and the second describes the sampling times of each taxon. The first element of the list (named $taxa.data) is a distinctive six-element vector composed of numbers (some are nominally integers, but not all, so all are stored as double-precision integers) with the following field names: [object Object],[object Object],[object Object],[object Object],[object Object],[object Object] The second element for each taxon-unit is a vector of sampling times, creatively named $sampling.times, with each value representing a data in absolute time when that taxon was sampled in the simulated fossil record. If a taxon was never sampled, this vector is an empty numeric vector of length = 0. As is typical for paleontological uses of absolute time, absolute time in these simulations is always decreasing toward the modern; i.e. an absolute date of 50 means a point in time which is 50 time-units before the present-day, if the present-day is zero (the default, but see argument shiftRoot4TimeSlice). Each individual element of a fossilRecordSimulation list object is named, generally of the form "t1" and "t2", where the number is the taxon.id. Cryptic taxa are instead named in the form of "t1.2" and "t5.3", where the first number is the taxon which they are a cryptic descendant of (looks.like) and the second number, after the period, is the order of appearance of lineage units in that cryptic complex. For example, for "t5.3", the first number is the taxon.id and the second number communicates that this is the third lineage to appear in this cryptic complex.

Details

simFossilRecord simulates a birth-death-sampling branching process (ala Foote, 1997, 2000; Stadler, 2009) in which lineages of organisms may branch, go extinct or be sampled thought a continuous time-interval, with the occurrence of these events modeled as Poisson process controlled by some set of instantaneous rates. This model is ultimately based on the birth-death model (Kendall, 1948; Nee, 2006), which is widely implemented in many R packages. Unlike other such typical branching simulators, this function enmeshes the lineage units within explicit models of how lineages are morphologically differentiated (Bapst, 2013). This is key to allow comparison to datasets from the fossil record, as morphotaxa are the basic units of paleontological diversity estimates and phylogenetic analyses. Models of Morphological Differentiation and Branching (Cladogenesis and Anagenesis) These models of morphological differentiation do not involve the direct simulation of morphological traits. Instead, morphotaxon identity is used as a proxy of the distinctiveness of lineages on morphological grounds, as if there was some hypothetical paleontologist attempting to taxonomically sort collections of specimens of these simulated lineages. Two lineages are either identical, and thus share the same morphotaxon identity, or they are distinct, and thus have separate morphotaxon identities. Morphological differentiation is assumed to be an instantaneous process for the purposes of this model, such that no intermediate could be uncovered. Specifically, simFossilRecord allows for three types of binary branching events (here grouped under the term 'cladogenesis': 'budding cladogenesis', 'bifurcating cladogenesis' and 'cryptic cladogenesis', as well as for a fourth event-type, 'anagenesis' (see Wagner and Erwin, 1995; Foote, 1996, and Bapst, 2013, for further details). Budding, bifurcation and cryptic cladogenetic events all share in common that a single geneological lineage splits into two descendant lineages, but differ in the morphological differentiation of these child lineages relative to their parent. Under budding cladogenesis, only one of the child lineages becomes morphologically distinguishable from the parent, and thus the ancestral morphotaxon persists through the branching event as the un-differtiated child lineage. Under bifurcating cladogenesis, both child lineages become immediately distinct from the ancestor, and thus two new morphotaxa appear while the ancestor terminates in an event known as 'pseudoextinction'. Crytic cladogenesis has no morphological differentiation: both child lineages are presumed to be indistinct from the ancestor and from each other, which means a hypothetical differentiation independent of any branching, such that a morphotaxon instanteously transitions to a new morphotaxon identity, resulting in the pseudoextinction of the ancestral morphotaxon and the immediate 'pseudospeciation' of the child morphotaxon. The two morphotaxa do not overlap in time at all, as modeled here (contra to the models described by Ezard et al., 2012). For ease of following these cryptic lineages, cryptic cladogenetic events are treated in terms of data structure similarly to budding cladogenetic events, with one child lineage treated as a persistance of the ancestral lineage, and the other as a new morphologically indistinguishable lineage. This model of cryptic cladogenesis is ultimately based on the hierarchical birth-death model used by many authors for modeling patterns across paraphyletic higher taxa and the lower taxon units within them (e.g. Patzkowsky, 1995; Foote, 2012). The occurrence of the various models is controlled by multiple arguments of simFossilRecord. The overall instantaneous rate of branching (cladogenesis) is controlled by argument p, and the proportion of each type of cladogenesis controlled by arguments prop.bifurc and prop.cryptic. prop.cryptic controls the overall probability that any branching event will be cryptic versus involving any morphological differentiation (budding or bifurcating). If prop.cryptic = 1, all branching events will be cryptic cladogenesis, and if prop.cryptic = 0, all branching events will involve morphological differentiation and none will be cryptic. prop.bifurc controls how many branching events that involve morphological differentiation (i.e. the inverse of prop.cryptic) are bifurcating, as opposed to budding cladogenesis. If prop.bifurc = 1, all morphologically-differentiating branching events will be bifurcating cladogenesis, and if prop.bifurc = 0, all morphologically-differentiating branching events will be budding cladogenesis. Thus, for example, the probability of a given cladogenesis event being budding is given by: Prob(budding cladogenesis at a branching event) = (1 - prop.cryptic) * (1 - prop.bifurc) By default, prop.cryptic = 0 and prop.bifurc = 0, so all branching events by default will be instances of budding cladogenesis. Anagenesis is completely independent of these, controlled as its own Poisson process with an instantaneous rated defined by the argument anag.rate. By default, this rate is set to zero and thus there is no anagenetic events without user intervention. Stopping Conditions and Acceptance Criteria for Simulations How forward-time simulations are generated, halted and whether they are accepted or not for output is a critical component of simulation design. Most uses of simFossilRecord will involve iteratively generating and analyzing multiple simulation runs. Runs are only accepted for output if they meet the conditioning criteria defined in the arguments, either matching point constraints or falling within range constraints. However, this requires separating the processes of halting simulation runs and accepting a run for output, particularly to avoid bias related to statistical sampling issues. Hartmann et al. (2011) recently discovered a potential statistical artifact when branching simulations are conditioned on some number of taxa. Previously, within paleotree, this was accounted for in simFossilTaxa by a complex arrangement of minimum and maximum constraints, and an (incorrect) presumption that allowing simulations to continue for a short distance after constraints were reached. This strategy is not applied here. Instead, simFossilRecord applies the General Sampling Algorithm presented by Hartmann et al. (or at least, a close variant). A simulation continues until extinction or some maximum time-constraint is reached, evaluated for intervals that match the set run conditions (e.g. nExtant, nTotalTime) and, if some interval or set of intervals matches the run conditions, a date is randomly sampled from within this interval/intervals. The simulation is then cut at this date using the timeSliceFossilRecord function, and saved as an accepted run. The simulation data is otherwise discarded and then a new simulation initiated (thus, at most, only one simulated dataset is accepted from one simulation run). Thus, accepted simulations runs should reflect unbiased samples of evolutionary histories that precisely match the input constraints, which can be very precise, unlike how stopping and acceptance conditions were handled in the previous simFossilTaxa function. Of course, selecting very precise constraints that are very unlikely or impossible given other model parameters may take considerable computation time to find acceptable simulation runs, or effectively never find any acceptable simulation runs. On Timescale Used in Output Dates given in the output are on an reversed absolute time-scale; i.e. time decreases going from the past to the future, as is typical in paleontological uses of time (as time before present) and as for most function in package paleotree. The endpoints of the time-scale are decided by details of the simulation and can be modified by several arguments. By default (with shiftRoot4TimeSlice = "withExtantOnly"), any simulation run that is accepted with extant taxa will have zero as the end-time (i.e. when those taxa are extant), as zero is the typical time assigned to the modern day in empirical studies. If a simulation ends with all taxa extinct, however, then instead the start-time of a run (i.e. when the run initiates with starting taxa) will be maximum value assigned to the conditioning argument totalTime. If shiftRoot4TimeSlice = FALSE then the start-time of the run will always be this maximum value for totalTime, and any extant taxa will stop at some time greater than zero.

References

Bapst, D. W. 2013. When Can Clades Be Potentially Resolved with Morphology? PLoS ONE 8(4):e62312. Ezard, T. H. G., P. N. Pearson, T. Aze, and A. Purvis. 2012. The meaning of birth and death (in macroevolutionary birth-death models). Biology Letters 8(1):139-142. Foote, M. 1996 On the Probability of Ancestors in the Fossil Record. Paleobiology 22(2):141--151. Foote, M. 1997. Estimating Taxonomic Durations and Preservation Probability. Paleobiology 23(3):278-300. Foote, M. 2000. Origination and extinction components of taxonomic diversity: general problems. Pp. 74-102. In D. H. Erwin, and S. L. Wing, eds. Deep Time: Paleobiology's Perspective. The Paleontological Society, Lawrence, Kansas. Foote, M. 2012. Evolutionary dynamics of taxonomic structure. Biology Letters 8(1):135-138. Gavryushkina, A., D. Welch, T. Stadler, and A. J. Drummond. 2014. Bayesian Inference of Sampled Ancestor Trees for Epidemiology and Fossil Calibration. PLoS Comput Biol. 10(12):e1003919. Hartmann, K., D. Wong, and T. Stadler. 2010 Sampling Trees from Evolutionary Models. Systematic Biology 59(4):465--476. Heath, T. A., J. P. Huelsenbeck, and T. Stadler. 2014. The fossilized birth-death process for coherent calibration of divergence-time estimates. Proceedings of the National Academy of Sciences 111(29):E2957-E2966. Kendall, D. G. 1948 On the Generalized "Birth-and-Death" Process. The Annals of Mathematical Statistics 19(1):1--15. Nee, S. 2006 Birth-Death Models in Macroevolution. Annual Review of Ecology, Evolution, and Systematics 37(1):1--17. Patzkowsky, M. E. 1995. A Hierarchical Branching Model of Evolutionary Radiations. Paleobiology 21(4):440-460. Solow, A. R., and W. Smith. 1997 On Fossil Preservation and the Stratigraphic Ranges of Taxa. Paleobiology 23(3):271--277. Stadler, T. 2009. On incomplete sampling under birth-death models and connections to the sampling-based coalescent. Journal of Theoretical Biology 261(1):58-66. Wagner, P. J., and D. H. Erwin. 1995. Phylogenetic patterns as tests of speciation models. Pp. 87-122. In D. H. Erwin, and R. L. Anstey, eds. New approaches to speciation in the fossil record. Columbia University Press, New York.

See Also

simFossilRecordMethods This function essentially replaces and adds to all functionality of the paleotree functions simFossilTaxa, simFossilTaxaSRCond, simPaleoTrees, as well as the combined used of simFossilTaxa and sampleRanges for some models of sampling.

Examples

Run this code
set.seed(2)

# quick birth-death-sampling run with 1 run, 50 taxa

record <- simFossilRecord(p=0.1, q=0.1, r=0.1, nruns=1,
	nTotalTaxa=50, plot=TRUE)

# examining multiple runs of simulations

#example of repeated pure birth simulations over 50 time-units
records <- simFossilRecord(p=0.1, q=0, nruns=10,
	totalTime=50, plot=TRUE)
#plot multiple diversity curves on a log scale
records<-lapply(records,fossilRecord2fossilTaxa)
multiDiv(records,plotMultCurves=TRUE,plotLogRich=TRUE)
#histogram of total number of taxa
hist(sapply(records,nrow))

#example of repeated birth-death-sampling simulations over 50 time-units
records <- simFossilRecord(p=0.1, q=0.1, r=0.1, nruns=10,
	totalTime=50, plot=TRUE)
records<-lapply(records,fossilRecord2fossilTaxa)
multiDiv(records,plotMultCurves=TRUE)

#like above, but conditioned instead on having 10 extant taxa
	# between 1 and 100 time-units
set.seed(4)
records <- simFossilRecord(p=0.1, q=0.1, r=0.1, nruns=10,
	totalTime=c(1,300), nExtant=10, plot=TRUE)
records<-lapply(records,fossilRecord2fossilTaxa)
multiDiv(records,plotMultCurves=TRUE)

################################################

# How probable were the runs I accepted?
	# The effect of conditions

set.seed(1)

# Let's look at an example of a birth-death process
	# with high extinction relative to branching
# use default run conditions (barely any conditioning)
# use print.runs to look at acceptance probability
records <- simFossilRecord(p=0.1, q=0.8, nruns=10,
	print.runs=TRUE, plot=TRUE)
# 10 runs accepted from a total of 10 !

# now let's give much more stringent run conditions
	# require 3 extant taxa at minimum, 5 taxa total minimum
records <- simFossilRecord(p=0.1, q=0.8, nruns=10,
	nExtant=c(3,100), nTotalTaxa=c(5,100),
	print.runs=TRUE, plot=TRUE)
# thousands of simulations to just obtail 10 accepable runs!
	# most ended in extinction before minimums were hit

# beware analysis of simulated where acceptance conditions
	# are too stringent: your data will be a 'special case'
	# of the simulation parameters
# it will also take you a long time to generate reasonable
	# numbers of replicates for whatever analysis you are doing

# TLDR: You should look at print.runs=TRUE

######################

# Using the rate equation-input for complex diversification models

# First up... Diversity Dependent Models!
# Let's try Diversity-Dependent Branching over 50 time-units

# first, let's write the rate equation
# We'll use the diversity dependent rate equation model
	# from Ettienne et al. 2012 as an example here
	# Under this equation, p=q at carrying capacity K
# Many others are possible!
# Note that we don't need to use max(0,rate) as negative rates
	# are converted to zero by default, as controlled by
	# the argument negRatesAsZero

# From Ettiene et al.
#	lambda = lambda0 - (lambda0 - mu)*(n/K)
# lambda and mu are branching rate and extinction rate
	# lambda and mu == p and q in paleotree (i.e. Foote convention)
# lambda0 is the branching rate at richness=0
# K is the carrying capacity
# n is the richness

# 'N' is the algebra symbol for standing taxonomic richness
	# for simFossilRecord's simulation capabilities
# also branching rate cannot reference extinction rate
# we'll have to set lambda0, mu and K in the rate equation directly

lambda0 <- 0.3	# branching rate at 0 richness in Ltu
K <- 40		# carrying capacity
mu <- 0.1		# extinction rate will 0.1 Ltu (= 1/3 of lambda0)

# technically, mu here represents the lambda at richness=K
	# i.e. lambdaK
# Ettienne et al. are just implicitly saying that the carrying capacity
	# is the richness at which lambda==mu

# construct the equation programmatically using paste0
branchingRateEq<-paste0(lambda0,"-(",lambda0,"-",mu,")*(N/",K,")")
# and take a look at it...
branchingRateEq
# its a thing of beauty, folks

# now let's try it
records <- simFossilRecord(p=branchingRateEq, q=mu, nruns=3,
	totalTime=100, plot=TRUE, print.runs=TRUE)
records<-lapply(records,fossilRecord2fossilTaxa)
multiDiv(records,plotMultCurves=TRUE)
# those are some happy little diversity plateaus!


# now let's do diversity-dependent extinction

# let's slightly modify the model from Ettiene et al.
#	mu = mu0 + (mu0 - muK)*(n/K)

mu0<-0.001		# mu at n=0
muK<-0.1		# mu at n=K (should be equal to lambda at K)
K<-40
lambda<-muK		# equal to muK

# construct the equation programmatically using paste0
extRateEq<-paste0(mu0,"-(",mu0,"-",muK,")*(N/",K,")")
extRateEq

# now let's try it
records <- simFossilRecord(p=lambda, q=extRateEq, nruns=3,
	totalTime=100, plot=TRUE, print.runs=TRUE)
records<-lapply(records,fossilRecord2fossilTaxa)
multiDiv(records,plotMultCurves=TRUE)

# these plateaus looks a little more spiky
	#( maybe there is more turnover at K? )
# also, it took a longer for the rapid rise to occur

# Now let's try an example with time-dependent origination
	# and extinction constrained to equal origination

# Note! Use of time-dependent parameters "D" and "T" may
# result in slower than normal simulation run times
# as the time-scale has to be discretized; see
# info for argument maxTimeStep above

# First, let's define a time-dependent rate equation
	# "T" is the symbol for time passed
timeEquation<-"0.4-(0.007*T)"

#in this equation, 0.4 is the rate at time=0
	# and it will decrease by 0.007 with every time-unit
	# at time=50, the final rate will be 0.05
# We can easily make it so extinction is always equal to branching rate
# "P" is the algebraic equivalent for "branching rate" in simFossilRecord

# now let's try it
records <- simFossilRecord(p=timeEquation, q="P", nruns=3,
	totalTime=50, plot=TRUE, print.runs=TRUE)
records<-lapply(records,fossilRecord2fossilTaxa)
multiDiv(records,plotMultCurves=TRUE)
# high variability that seems to then smooth out as turnover decreases

# And duration what about duration-dependent processes?
		# let's do a duration-dep extinction equation:
durDepExt<-"0.01+(0.01*D)"

# okay, let's take it for a spin
records <- simFossilRecord(p=0.1, q=durDepExt, nruns=3,
	totalTime=50, plot=TRUE, print.runs=TRUE)
records<-lapply(records,fossilRecord2fossilTaxa)
multiDiv(records,plotMultCurves=TRUE)
# creates runs full of short lived taxa

##########################################################

# Speciation Modes
# Some examples of varying the 'speciation modes' in simFossilRecord

# The default is pure budding cladogenesis
	# anag.rate = prop.bifurc = prop.cryptic = 0
# let's just set those for the moment anyway
record <- simFossilRecord(p=0.1, q=0.1, r=0.1,
	anag.rate=0, prop.bifurc=0, prop.cryptic=0,
	nruns=1, nTotalTaxa=c(20,30) ,nExtant=0, plot=TRUE)

#convert and plot phylogeny
	# note this will not reflect the 'budding' pattern
	# branching events will just appear like bifurcation
	# its a typical convention for phylogeny plotting
converted<-fossilRecord2fossilTaxa(record)
tree<-taxa2phylo(converted,plot=TRUE)

#now, an example of pure bifurcation
record <- simFossilRecord(p=0.1, q=0.1, r=0.1,
	anag.rate=0, prop.bifurc=1, prop.cryptic=0,
	nruns=1, nTotalTaxa=c(20,30) ,nExtant=0)
tree<-taxa2phylo(fossilRecord2fossilTaxa(record),plot=TRUE)

# all the short branches are due to ancestors that terminate
	# via pseudoextinction at bifurcation events

# an example with anagenesis = branching
record <- simFossilRecord(p=0.1, q=0.1, r=0.1,
	anag.rate=0.1, prop.bifurc=0, prop.cryptic=0,
	nruns=1, nTotalTaxa=c(20,30) ,nExtant=0)
tree<-taxa2phylo(fossilRecord2fossilTaxa(record),plot=TRUE)
# lots of pseudoextinction

# an example with anagenesis, pure bifurcation
record <- simFossilRecord(p=0.1, q=0.1, r=0.1,
	anag.rate=0.1, prop.bifurc=1, prop.cryptic=0,
	nruns=1, nTotalTaxa=c(20,30) ,nExtant=0)
tree<-taxa2phylo(fossilRecord2fossilTaxa(record),plot=TRUE)
# lots and lots of pseudoextinction

# an example with half cryptic speciation
record <- simFossilRecord(p=0.1, q=0.1, r=0.1,
	anag.rate=0, prop.bifurc=0, prop.cryptic=0.5,
	nruns=1, nTotalTaxa=c(20,30), nExtant=0)
tree<-taxa2phylo(fossilRecord2fossilTaxa(record),plot=TRUE)

# notice that the tree has many more than the maximum of 30 tips:
	# that's because the cryptic taxa are not counted as
	# separate taxa by default, as controlled by count.cryptic

# an example with anagenesis, bifurcation, cryptic speciation
record <- simFossilRecord(p=0.1, q=0.1, r=0.1,
	anag.rate=0.1, prop.bifurc=0.5, prop.cryptic=0.5,
	nruns=1, nTotalTaxa=c(20,30), nExtant=0)
tree<-taxa2phylo(fossilRecord2fossilTaxa(record),plot=TRUE)
# note in this case, 50% of branching is cryptic
	# 25% is bifurcation, 25% is budding

# an example with anagenesis, pure cryptic speciation
	# morphotaxon identity will thus be entirely indep of branching!
	# I wonder if this is what is really going on, sometimes...
record <- simFossilRecord(p=0.1, q=0.1, r=0.1,
	anag.rate=0.1, prop.bifurc=0, prop.cryptic=1,
	nruns=1, nTotalTaxa=c(20,30), nExtant=0)
tree<-taxa2phylo(fossilRecord2fossilTaxa(record),plot=TRUE)

#############

# playing with count.cryptic with simulations of pure cryptic speciation

#can choose to condition on total morphologically-distinguishable taxa
    #or total taxa including cryptic taxa with count.cryptic=FALSE

# an example with pure cryptic speciation with count.cryptic=TRUE
record <- simFossilRecord(p=0.1, q=0.1, r=0.1,
	anag.rate=0, prop.bifurc=0, prop.cryptic=1,
	nruns=1, totalTime=50, nTotalTaxa=c(10,100), count.cryptic=TRUE)
tree<-taxa2phylo(fossilRecord2fossilTaxa(record))
plot(tree);axisPhylo()
# notice how the tip labels indicate all are the same morphotaxon

# we'll replace the # of taxa constraints with a time constraint
	# or else the count.cryptic=FALSE simulation will never end!

# an example with pure cryptic speciation with count.cryptic=FALSE
record <- simFossilRecord(p=0.1, q=0.1, r=0.1,
	anag.rate=0, prop.bifurc=0, prop.cryptic=1,
	nruns=1, totalTime=50, count.cryptic=FALSE)
tree<-taxa2phylo(fossilRecord2fossilTaxa(record))
plot(tree);axisPhylo()

#let's look at numbers of taxa returned when varying count.cryptic
	# with prop.cryptic=0.5

#simple simulation going for 50 total taxa

#first, count.cryptic=FALSE (default)
record <- simFossilRecord(p=0.1, q=0.1, r=0.1,
	anag.rate=0, prop.bifurc=0, prop.cryptic=0.5,
	nruns=1, nTotalTaxa=50, count.cryptic=FALSE)
taxa<-fossilRecord2fossilTaxa(record)
nrow(taxa)                 		#number of lineages (inc. cryptic)
length(unique(taxa[,6]))            #number of morph-distinguishable taxa

# and count.cryptic=TRUE
record <- simFossilRecord(p=0.1, q=0.1, r=0.1,
	anag.rate=0, prop.bifurc=0, prop.cryptic=0.5,
	nruns=1, nTotalTaxa=50, count.cryptic=TRUE)
taxa<-fossilRecord2fossilTaxa(record)
nrow(taxa)                 		#number of lineages (inc. cryptic)
length(unique(taxa[,6]))            #number of morph-distinguishable taxa

# okay...
# now let's try with 50 extant taxa

#first, count.cryptic=FALSE (default)
record <- simFossilRecord(p=0.1, q=0.1, r=0.1,
	anag.rate=0, prop.bifurc=0, prop.cryptic=0.5,
	nruns=1, nExtant=10, totalTime=c(1,100), count.cryptic=FALSE)
taxa<-fossilRecord2fossilTaxa(record)
sum(taxa[,5])             		  	#number of still-living lineages (inc. cryptic)
length(unique(taxa[taxa[,5]==1,6]))	   	#number of still-living morph-dist. taxa

# and count.cryptic=TRUE
record <- simFossilRecord(p=0.1, q=0.1, r=0.1,
	anag.rate=0, prop.bifurc=0, prop.cryptic=0.5,
	nruns=1, nExtant=10, totalTime=c(1,100), count.cryptic=TRUE)
taxa<-fossilRecord2fossilTaxa(record)
sum(taxa[,5])             		  	#number of still-living lineages (inc. cryptic)
length(unique(taxa[taxa[,5]==1,6]))	   	#number of still-living morph-dist. taxa

#################################################

# an example using startTaxa to have more initial taxa
record <- simFossilRecord(p=0.1, q=0.1, r=0.1, nruns=1,
	nTotalTaxa=100, startTaxa=20, plot=TRUE)

######################################################

# Using run conditions

# Users can generate datasets that meet multiple conditions:
	# such as time, number of total taxa, extant taxa, sampled taxa
# These can be set as point conditions or ranges

# let's set time = 10-100 units, total taxa = 30-40, extant = 10
	#and look at acceptance rates with print.run
record <- simFossilRecord(p=0.1, q=0.1, r=0.1, nruns=1,
	totalTime=c(10,100), nTotalTaxa=c(30,40), nExtant=10,
	print.runs=TRUE, plot=TRUE)

# let's make the constraints on totaltaxa a little tighter
record <- simFossilRecord(p=0.1, q=0.1, r=0.1, nruns=1,
	totalTime=c(50,100), nTotalTaxa=30, nExtant=10,
	print.runs=TRUE, plot=TRUE)
# still okay acceptance rates

# alright, now let's add a constraint on sampled taxa
record <- simFossilRecord(p=0.1, q=0.1, r=0.1, nruns=1,
	totalTime=c(50,100), nTotalTaxa=30, nExtant=10,
	nSamp=15, print.runs=TRUE, plot=TRUE)
# still okay acceptance rates

########################################################

# Simulations of entirely extinct taxa

#Typically, a user may want to condition on a precise
	# number of sampled taxa in an all-extinct simulation
record <- simFossilRecord(p=0.1, q=0.1, r=0.1, nruns=1,
	nTotalTaxa=c(1,100), nExtant=0, nSamp=20,
	print.runs=TRUE, plot=TRUE)

# Note that when simulations don't include
# sampling or extant taxa, the plot
# functionality changes
record <- simFossilRecord(p=0.1, q=0.1, r=0, nruns=1,
	nExtant=0, print.runs=TRUE, plot=TRUE)
# something similar happens when there is no sampling
# and there are extant taxa but they aren't sampled
record <- simFossilRecord(p=0.1, q=0.1, r=0, nruns=1,
	nExtant=10, nTotalTaxa=100, modern.samp.prob=0,
	print.runs=TRUE, plot=TRUE)


# We can set up a test to make sure that no extant taxa somehow get
# returned in many simulations with extinct-only conditioning:
res<-simFossilRecord(p=0.1, q=0.1, r=0.1,nTotalTaxa=10,nExtant=0,nruns=1000,plot=TRUE)
anyLive<-any(sapply(res,function(z) any(sapply(z,function(x) x[[1]][5]==1))))
if(anyLive){
	stop("Runs have extant taxa under conditioning for none?")
	}

Run the code above in your browser using DataLab