Usage
oncoSimulIndiv(fp, model = "Exp", numPassengers = 0, mu = 1e-6,
detectionSize = 1e8, detectionDrivers = 4,
sampleEvery = ifelse(model %in% c("Bozic", "Exp"), 1,
0.025),
initSize = 500, s = 0.1, sh = -1,
K = initSize/(exp(1) - 1), keepEvery = sampleEvery,
minDetectDrvCloneSz = "auto",
extraTime = 0,
finalTime = 0.25 * 25 * 365, onlyCancer = TRUE,
keepPhylog = FALSE,
mutationPropGrowth = ifelse(model == "Bozic",
FALSE, TRUE),
max.memory = 2000, max.wall.time = 200,
max.num.tries = 500,
errorHitWallTime = TRUE,
errorHitMaxTries = TRUE,
verbosity = 0,
initMutant = NULL,
seed = NULL)

oncoSimulPop(Nindiv, fp, model = "Exp", numPassengers = 0, mu = 1e-6,
detectionSize = 1e8, detectionDrivers = 4,
sampleEvery = ifelse(model %in% c("Bozic", "Exp"), 1,
0.025),
initSize = 500, s = 0.1, sh = -1,
K = initSize/(exp(1) - 1), keepEvery = sampleEvery,
minDetectDrvCloneSz = "auto",
extraTime = 0,
finalTime = 0.25 * 25 * 365, onlyCancer = TRUE,
keepPhylog = FALSE,
mutationPropGrowth = ifelse(model == "Bozic",
FALSE, TRUE),
max.memory = 2000, max.wall.time = 200,
max.num.tries = 500,
errorHitWallTime = TRUE,
errorHitMaxTries = TRUE,
initMutant = NULL,
verbosity = 0,
mc.cores = detectCores(),
seed = "auto")
oncoSimulSample(Nindiv,
fp,
model = "Exp",
numPassengers = 0,
mu = 1e-6,
detectionSize = round(runif(Nindiv, 1e5, 1e8)),
detectionDrivers = {
if(inherits(fp, "fitnessEffects")) {
if(length(fp$drv)) {
nd <- (2: round(0.75 * length(fp$drv)))
} else {
nd <- 9e6
}
} else {
nd <- (2 : round(0.75 * max(fp)))
}
if (length(nd) == 1)
nd <- c(nd, nd)
sample(nd, Nindiv,
replace = TRUE)
},
sampleEvery = ifelse(model %in% c("Bozic", "Exp"), 1,
0.025),
initSize = 500,
s = 0.1,
sh = -1,
K = initSize/(exp(1) - 1),
minDetectDrvCloneSz = "auto",
extraTime = 0,
finalTime = 0.25 * 25 * 365,
onlyCancer = TRUE, keepPhylog = FALSE,
mutationPropGrowth = ifelse(model == "Bozic",
FALSE, TRUE),
max.memory = 2000,
max.wall.time.total = 600,
max.num.tries.total = 500 * Nindiv,
typeSample = "whole",
thresholdWhole = 0.5,
initMutant = NULL,
verbosity = 1,
seed = "auto")
Arguments
Nindiv
Number of individuals or number of different
trajectories to simulate.
fp
Either a poset that specifies the order restrictions (see poset), if
you want to use the specification as in v.1, or a fitnessEffects
object (see allFitnessEffects). Other arguments below (s, sh,
numPassengers) make sense only if you use a poset, as they are
included in the fitnessEffects object.
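As a minimal sketch of building a fitnessEffects object and passing it as fp, the example below assumes the OncoSimulR package is installed. The gene names A and B, the parameter values, and the short finalTime are illustrative assumptions, not recommendations.

```r
# Illustrative restriction table: gene B depends on gene A
# (gene names and values are hypothetical).
restrictions <- data.frame(parent = c("Root", "A"),
                           child = c("A", "B"),
                           s = 0.1,        # effect when restrictions are satisfied
                           sh = -0.9,      # effect when they are not satisfied
                           typeDep = "MN", # monotonic dependency
                           stringsAsFactors = FALSE)

# Run the simulation only if OncoSimulR is available.
if (requireNamespace("OncoSimulR", quietly = TRUE)) {
  fe <- OncoSimulR::allFitnessEffects(restrictions)
  sim <- OncoSimulR::oncoSimulIndiv(fe, model = "Exp",
                                    detectionDrivers = 2,
                                    finalTime = 500,
                                    onlyCancer = FALSE)
  print(sim)
}
```

With onlyCancer = FALSE a single run is returned even if the detection conditions are never met, which keeps the sketch fast.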
model
One of "Bozic", "Exp", "McFarlandLog" (the last one can be abbreviated
to "McFL"). The default is "Exp".
numPassengers
This has no effect if you use the allFitnessEffects
specification.
If you use the specification of v.1., the number of passenger
genes. Note that using v.1 the total number of genes (drivers plus
passengers) must be smaller than 64.
All driver genes should be included in the poset (even if they depend
on no one and no one depends on them), and will be numbered from 1 to
the total number of driver genes. Thus, passenger genes will be
numbered from (number of driver genes + 1):(number of drivers + number
of passengers).
mu
Mutation rate. See also mutationPropGrowth.
detectionSize
What is the minimal number of cells for cancer to be detected. For
oncoSimulSample this can be a vector.
detectionDrivers
The minimal number of drivers (not modules, drivers, whether or not
they are from the same module) present in any clone for cancer to be
detected. For oncoSimulSample this can be a vector.
For oncoSimulSample, if there are drivers (either because you are
using a v.1 object or because you are using a fitnessEffects object
with a drvNames component; see allFitnessEffects), the default is a
vector of numbers of drivers drawn from a uniform distribution between
2 and 0.75 times the total number of drivers. If there are no drivers
(because you are using a fitnessEffects object without a drvNames
component, either because you specified it explicitly or because all
of the genes are in the noIntGenes component), the simulations should
not stop based on the number of drivers (and, thus, the default is set
to 9e6).
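The default expression for detectionDrivers shown in Usage can be restated as a small base R function. This is only a sketch: default_detection_drivers and n_drivers are hypothetical names, with n_drivers standing in for length(fp$drv) (fitnessEffects) or max(fp) (poset).

```r
# Sketch of the default detectionDrivers logic used by oncoSimulSample.
default_detection_drivers <- function(n_drivers, Nindiv) {
  if (n_drivers > 0) {
    nd <- 2:round(0.75 * n_drivers)
  } else {
    nd <- 9e6  # effectively never stop based on the number of drivers
  }
  # sample(x, ...) with a single number x >= 1 samples from 1:x, so a
  # length-one nd is duplicated to force sampling of that exact value.
  if (length(nd) == 1) nd <- c(nd, nd)
  sample(nd, Nindiv, replace = TRUE)
}

set.seed(1)
with_drivers <- default_detection_drivers(n_drivers = 8, Nindiv = 10)
no_drivers <- default_detection_drivers(n_drivers = 0, Nindiv = 3)
```

Here with_drivers holds one threshold per individual, each between 2 and 6, and no_drivers is 9e6 for every individual.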
sampleEvery
How often the whole population is sampled. This is not the same as the
interval between successive samples that are kept or stored (for that,
see keepEvery). For very fast growing clones, you might need to have a
small value here to minimize possible numerical problems (such as a
huge increase in population size between two successive samples that
can then lead to problems for random number generators). Likewise, for
models with density dependence (such as McFL) this value should be
very small.
initSize
Initial population size.
s
Selection coefficient for drivers.
Only relevant if using a poset as this is included in the
fitnessEffects object.
sh
Selection coefficient for drivers with restrictions not satisfied. A
value of 0 means there are no penalties for a driver appearing in a
clone when its restrictions are not satisfied. To specify "sh=Inf" (in Diaz-Uriarte, 2014) use sh = -1.
Only relevant if using a poset as this is included in the
fitnessEffects object.
K
Initial population equilibrium size in the McFarland models.
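The default expressions for sampleEvery and K given in Usage can be evaluated directly in base R. The helper names default_sample_every and default_K are hypothetical and exist only for this sketch.

```r
# Defaults copied from the Usage section, wrapped in small helpers.
default_sample_every <- function(model) {
  ifelse(model %in% c("Bozic", "Exp"), 1, 0.025)
}
default_K <- function(initSize) initSize / (exp(1) - 1)

# Sampling interval per model: 1 for "Exp" and "Bozic", 0.025 otherwise.
sample_every <- default_sample_every(c("Exp", "Bozic", "McFL"))

# Default K for the default initSize of 500 (roughly 291).
K_default <- default_K(500)
```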
keepEvery
Time interval between successive whole population samples that are
actually stored. This must be larger than or equal to sampleEvery. If
keepEvery is not an integer multiple of sampleEvery, the keepEvery in
use will be the smallest integer multiple of sampleEvery larger than
the specified keepEvery. If you want nice plots, set sampleEvery and
keepEvery to small values (say, 1 or 0.5). Otherwise, you can use a
sampleEvery of 1 but a keepEvery of 15, so that the return objects are
not huge.
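Reading the rule above as rounding keepEvery up to the next integer multiple of sampleEvery, the adjustment can be sketched as follows. The function effective_keep_every is a hypothetical helper, and the package's internal arithmetic may differ in how it handles floating point edge cases.

```r
# Round keepEvery up to the nearest integer multiple of sampleEvery.
effective_keep_every <- function(keepEvery, sampleEvery) {
  stopifnot(keepEvery >= sampleEvery)
  ceiling(keepEvery / sampleEvery) * sampleEvery
}

already_multiple <- effective_keep_every(15, 1)    # 15 is kept as is
rounded_up <- effective_keep_every(1.2, 0.5)       # next multiple of 0.5 is 1.5
```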
minDetectDrvCloneSz
A value of 0 or larger than 0 (by default equal to initSize in the
McFarland model). If larger than 0, when checking if we are done with
a simulation, we verify that the sum of the population sizes of all
clones that have a number of mutated drivers larger than or equal to
detectionDrivers is larger than or equal to this minDetectDrvCloneSz.
The reason for this parameter is to ensure that, say, a clone with a
certain number of drivers that would cause the simulation to end has
not just appeared and is present in only one individual that might
then immediately go extinct. This can be relevant in scenarios such
as the McFarland model.
See also extraTime.
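The driver-based stopping check described for minDetectDrvCloneSz can be illustrated with a base R sketch. The function done_by_drivers and its inputs are hypothetical; the actual check is performed inside the package's C++ code.

```r
# TRUE when clones carrying at least detectionDrivers drivers together
# reach a total size of at least minDetectDrvCloneSz.
done_by_drivers <- function(pop_sizes, n_drivers,
                            detectionDrivers, minDetectDrvCloneSz) {
  qualifying <- n_drivers >= detectionDrivers
  any(qualifying) && sum(pop_sizes[qualifying]) >= minDetectDrvCloneSz
}

pop_sizes <- c(1000, 200, 1)  # cells per clone
n_drivers <- c(1, 3, 4)       # mutated drivers per clone

# A single cell with 4 drivers is not enough when 500 cells are required.
too_small <- done_by_drivers(pop_sizes, n_drivers, 4, 500)
# With minDetectDrvCloneSz = 0 the same clone ends the simulation.
no_minimum <- done_by_drivers(pop_sizes, n_drivers, 4, 0)
```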
extraTime
A value larger than zero waits that many additional time periods
before exiting after having reached the exit condition (population
size, number of drivers). The reason for this setting is to prevent
the McFL models from always exiting at a time when one clone is
increasing its size quickly (see minDetectDrvCloneSz). By setting an
extraTime larger than 0, we can sample at points when we are at the
plateau.
finalTime
What is the maximum number of time units that the simulation can run.
onlyCancer
Return only simulations that reach cancer? If set to TRUE, only
simulations that satisfy the detectionDrivers or the detectionSize
requirements will be returned: the simulation will be repeated, within
the limits set by max.num.tries and max.wall.time (and, for
oncoSimulSample, also max.num.tries.total and max.wall.time.total),
until one which meets the detectionDrivers or detectionSize is
obtained. Otherwise, the simulation is returned regardless of final
population size or number of drivers in any clone and this includes
simulations where the population goes extinct.
keepPhylog
If TRUE, keep track of when and from which clone each clone is
created. See also plotClonePhylog.
mutationPropGrowth
If TRUE, make mutation rate proportional to growth rate, so clones
that grow faster also mutate faster. Thus, $mutation_rate = mu *
birth_rate$. This is a simple way of approximating that mutation happens
at cell division (it is not strictly making mutation happen at cell division,
since mutation is not strictly coupled with division).
Of course, this only makes sense in models where birth rate changes.
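The relation mutation_rate = mu * birth_rate can be checked with a few illustrative numbers. The birth rates below are made-up values for three hypothetical clones, not output from any model.

```r
mu <- 1e-6

# Hypothetical birth rates for clones with 0, 1 and 2 drivers.
birth_rate <- c(wildtype = 1.0, one_driver = 1.1, two_drivers = 1.21)

# With mutationPropGrowth = TRUE, faster growing clones mutate faster.
mutation_rate <- mu * birth_rate
```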
initMutant
For v.2, a string with the mutations of the initial mutant, if any.
This is the same format as for evalGenotype. For v.1, the single
mutation of the initial clone for the simulations. The default (if you
pass nothing) is to start the simulation from the wildtype genotype
with nothing mutated.
max.num.tries
Only applies when onlyCancer = TRUE. What is
the maximum number of times, for an individual simulation, we can
repeat the simulation for it to reach cancer? There are certain
parameter settings where reaching cancer is extremely unlikely and you
might not want to run forever in those cases.
max.num.tries.total
Only applies when onlyCancer = TRUE and for oncoSimulSample. What is
the maximum number of times,
over all simulations for all individuals in a population sample, that
we can repeat the simulations so that cancer is reached for all
individuals? The idea is to set a limit on the average minimal
probability of reaching cancer for a set of simulations to be
accepted.
max.wall.time
Maximum wall time for each individual simulation run. If the
simulation is not done in this time, it is aborted.
max.wall.time.total
Maximum wall time for all the simulations (when using
oncoSimulSample), in seconds. If the simulation is not completed in
this time, it is aborted. To prevent problems from a single individual
simulation going wild, this limit is also enforced per simulation (so
the run can be aborted directly from C++).
errorHitMaxTries
If TRUE (the default), a simulation that reaches the maximum number of
repetitions allowed is considered not to have successfully finished
and, thus, an error, and no output from it will be reported. This is
often what you want. See Details.
errorHitWallTime
If TRUE (the default), a simulation that reaches the maximum wall time
is considered not to have successfully finished and, thus, an error,
and no output from it will be reported. This is often what you
want. See Details.
max.memory
The largest size (in MB) of the matrix of Populations by Time. If
creating it would use more than this amount of memory, it is not
created. This prevents you from accidentally passing parameters that
will return an enormous object.
verbosity
If 0, run as silently as possible. Otherwise, increasing values of
verbosity provide progressively more information about intermediate
steps, possible numerical notes/warnings from the C++ code, etc.
typeSample
"singleCell" (or "single") for single cell sampling, where the
probability of sampling a cell (a clone) is directly proportional to
its population size. "wholeTumor" (or "whole") for whole tumor
sampling (i.e., this is similar to a biopsy being the entire
tumor). See samplePop.
thresholdWhole
In whole tumor sampling, whether a gene is detected as mutated depends
on thresholdWhole: a gene is considered mutated if it is altered in at
least thresholdWhole proportion of the cells in that individual. See
samplePop.
mc.cores
Number of cores to use when simulating more than one
individual (i.e., when calling oncoSimulPop).
seed
The seed for the C++ PRNG. You can pass a value. If you set
it to NULL, then a seed will be generated in R and passed to C++. If
you set it to "auto", then if you are using v.1, the behavior is the
same as if you set it to NULL (a seed will be generated in R and
passed to C++) but if you are using v.2, a random seed will be
produced in C++. If you need reproducibility, either pass a value or
set it to NULL (setting it to NULL will make the C++ seed reproducible
if you use the same seed in R via set.seed). However, even using the
same value of seed is unlikely to give the exact same results between
platforms and compilers. Moreover, note that the defaults for seed are
not the same in oncoSimulIndiv, oncoSimulPop and oncoSimulSample.
When using oncoSimulPop, if you want reproducibility, you might want
to, in addition to setting seed = NULL, also do
RNGkind("L'Ecuyer-CMRG") as we use mclapply; look at the vignette of
parallel.
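The RNGkind plus set.seed pattern mentioned for oncoSimulPop can be tried out with base R alone. In this sketch fake_simulation is a hypothetical stand-in for a simulation that draws its seed in R (as happens with seed = NULL); it is not an OncoSimulR function.

```r
library(parallel)

# Stand-in for one simulation whose seed is generated in R.
fake_simulation <- function(i) runif(1)

# Forking is not available on Windows, so fall back to one core there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

# L'Ecuyer-CMRG gives mclapply workers reproducible RNG streams.
RNGkind("L'Ecuyer-CMRG")

set.seed(123)
run1 <- mclapply(1:4, fake_simulation, mc.cores = cores)
set.seed(123)
run2 <- mclapply(1:4, fake_simulation, mc.cores = cores)

# Compare the two runs element by element.
same_results <- identical(run1, run2)
```

Resetting the seed before each call is what makes the two runs comparable; without RNGkind("L'Ecuyer-CMRG") the forked workers would not be guaranteed the same streams.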