This internal function simulates a new dataset containing covariates, outcome probabilities, competing event probabilities (if any), outcomes, and competing events (if any) based on an observed dataset and a user-specified intervention.
simulate(
o,
fitcov,
fitY,
fitD,
ymodel_predict_custom,
yrestrictions,
compevent_restrictions,
restrictions,
outcome_name,
compevent_name,
time_name,
intvars,
interventions,
int_times,
histvars,
histvals,
histories,
comprisk,
ranges,
outcome_type,
subseed,
obs_data,
time_points,
parallel,
covnames,
covtypes,
covparams,
covpredict_custom,
basecovs,
max_visits,
baselags,
below_zero_indicator,
min_time,
show_progress,
pb,
int_visit_type,
sim_trunc,
...
)
A data table containing simulated data under the specified intervention.
Integer specifying the index of the current intervention.
List of model fits for the time-varying covariates.
Model fit for the outcome variable.
Model fit for the competing event variable, if any.
Function obtaining predictions from the custom outcome model specified in ymodel_fit_custom. See the vignette "Using Custom Outcome Models in gfoRmula" for details.
List of vectors. Each vector contains as its first entry a condition and as its second entry an integer. When the condition is TRUE, the outcome variable is simulated according to the fitted model; when the condition is FALSE, the outcome variable takes on the value in the second entry.
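The condition/value rule above can be sketched in base R. This is only an illustration under assumptions: the data frame, its column names, and the evaluation via eval(parse()) are hypothetical and are not taken from the package internals.

```r
# Hypothetical simulated data at one time point (column names are made up).
sim <- data.frame(id = 1:4, L1 = c(0, 1, 1, 0))

# One restriction: keep the model-based draw of Y only when L1 == 1;
# otherwise set Y to 0. c() coerces both entries to character, so the
# value is converted back with as.numeric() below.
yrestrictions <- list(c("L1 == 1", 0))

# Stand-in for draws from the fitted outcome model.
sim$Y <- c(1, 1, 0, 1)

for (r in yrestrictions) {
  # Evaluate the condition string row by row against the simulated data.
  keep <- eval(parse(text = r[1]), envir = sim)
  # Rows where the condition is FALSE take the value in the second entry.
  sim$Y[!keep] <- as.numeric(r[2])
}

sim$Y
```

Rows 1 and 4 fail the condition, so their outcome is overwritten with the second entry; rows 2 and 3 keep their model-based draws.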
List of vectors. Each vector contains as its first entry a condition and as its second entry an integer. When the condition is TRUE, the competing event variable is simulated according to the fitted model; when the condition is FALSE, the competing event variable takes on the value in the second entry.
List of vectors. Each vector contains four entries: first, a covariate for which a priori knowledge of its distribution is available; second, a condition that must be TRUE for the distribution of that covariate to be estimated via a parametric model or other fitting procedure (that is, a condition under which no a priori knowledge is available); third, a function for estimating the distribution of that covariate when the condition in the second entry is FALSE, so that a priori knowledge of the covariate distribution is available; and fourth, a value used by the function in the third entry. The default is NA.
Character string specifying the name of the outcome variable in obs_data.
Character string specifying the name of the competing event variable in obs_data.
Character string specifying the name of the time variable in obs_data.
List, whose elements are vectors of character strings. The kth vector in intvars specifies the name(s) of the variable(s) to be intervened on in each round of the simulation under the kth intervention in interventions.
List, whose elements are lists of vectors. Each list in interventions specifies a unique intervention on the relevant variable(s) in intvars. Each vector contains a function implementing a particular intervention on a single variable, optionally followed by one or more "intervention values" (i.e., integers used to specify the treatment regime).
List, whose elements are lists of vectors. The kth list in int_times corresponds to the kth intervention in interventions. Each vector specifies the time points at which the relevant intervention is applied to the corresponding variable in intvars. When an intervention is not applied, the simulated natural course value is used. By default, this argument is set so that all interventions are applied at all time points.
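The parallel structure of intvars, interventions, and int_times can be sketched as follows. The function static_sketch is a hypothetical placeholder for an intervention function, and the variable name "A" and the number of time points are assumptions made for illustration.

```r
# Hypothetical placeholder for an intervention function; it is never called
# here and only marks where the function sits inside each vector.
static_sketch <- function(...) NULL

time_points <- 3

# Two interventions, both on the (assumed) treatment variable "A".
intvars <- list("A", "A")

# kth element: a list with one vector per intervened variable. Each vector
# holds the function followed by its intervention values. Because the first
# entry is a function, c() returns a list rather than an atomic vector.
interventions <- list(
  list(c(static_sketch, rep(0, time_points))),  # never treat
  list(c(static_sketch, rep(1, time_points)))   # always treat
)

# kth element: the time points at which the kth intervention is applied.
int_times <- list(
  list(0:(time_points - 1)),
  list(0:(time_points - 1))
)

# The three lists must line up element by element.
stopifnot(
  length(intvars) == length(interventions),
  length(interventions) == length(int_times)
)

# Intervention values of the second intervention on its first variable.
unlist(interventions[[2]][[1]][-1])
```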
List of vectors. The kth vector specifies the names of the variables for which the kth history function in histories is to be applied.
List of length two. The first element is a numeric vector specifying the lags used in the model statements (e.g., if lag1_varname and lag2_varname were included in the model statements, this vector would be c(1,2)). The second element is a numeric vector specifying the lag averages used in the model statements.
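As a rough illustration of the two-element list described above, the lag indices could be recovered from the term names used in the model statements. The term names and the regular expressions below are assumptions made for this sketch, not the package's own parsing code.

```r
# Hypothetical terms appearing in the model statements.
terms_used <- c("lag1_A", "lag2_A", "lag1_L1", "lag_cumavg1_L2")

# Pull the integer that follows a given prefix, e.g. "lag2_A" -> 2.
extract_index <- function(terms, prefix) {
  pattern <- paste0("^", prefix, "([0-9]+)_.*$")
  hits <- grep(pattern, terms, value = TRUE)
  sort(unique(as.numeric(sub(pattern, "\\1", hits))))
}

histvals <- list(
  extract_index(terms_used, "lag"),        # lags: 1 and 2
  extract_index(terms_used, "lag_cumavg")  # lag averages: 1
)

histvals
```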
Vector of history functions to apply to the variables specified in histvars.
Logical scalar indicating the presence of a competing event.
List of vectors. Each vector contains the minimum and maximum values of one of the covariates in covnames.
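A list of this shape can be built from observed data in one line. The data frame and covariate names here are made up for illustration.

```r
# Hypothetical observed data with two covariates.
obs_data <- data.frame(L1 = c(2.5, 0.1, 7.3), L2 = c(10, 12, 11))
covnames <- c("L1", "L2")

# One c(min, max) vector per covariate, in the same order as covnames.
ranges <- lapply(covnames, function(v) range(obs_data[[v]]))

ranges
```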
Character string specifying the "type" of the outcome. The possible "types" are: "survival", "continuous_eof", and "binary_eof".
Integer specifying the seed for this simulation.
Data table containing the observed data.
Number of time points to simulate.
Logical scalar indicating whether to parallelize simulations of different interventions to multiple cores.
Vector of character strings specifying the names of the time-varying covariates in obs_data.
Vector of character strings specifying the "type" of each time-varying covariate included in covnames. The possible "types" are: "binary", "normal", "categorical", "bounded normal", "zero-inflated normal", "truncated normal", "absorbing", "categorical time", and "custom".
List of vectors, where each vector contains information for one parameter used in the modeling of the time-varying covariates (e.g., model statement, family, link function, etc.). Each vector must be the same length as covnames and in the same order. If a parameter is not required for a certain covariate, it should be set to NA at that index.
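The length and ordering requirement can be checked with a small sketch. The covariate names, the model formulas, and the element names covmodels and covlink are illustrative assumptions; consult the package documentation for the parameters it actually accepts.

```r
covnames <- c("L1", "L2", "A")

# One vector per modeling parameter, each aligned with covnames.
covparams <- list(
  covmodels = c(
    L1 ~ lag1_A + lag1_L1,
    L2 ~ lag1_A + L1 + lag1_L2,
    A  ~ lag1_A + L1 + L2
  ),
  # Assume L2 needs no link function, so its slot is NA.
  covlink = c("logit", NA, "logit")
)

# Every parameter vector must have one entry per covariate.
stopifnot(all(lengths(covparams) == length(covnames)))

# The response of each model statement should follow the order of covnames.
responses <- vapply(covparams$covmodels, function(f) all.vars(f)[1], character(1))
stopifnot(identical(responses, covnames))
```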
Vector containing custom prediction functions for time-varying covariates that do not fall within the pre-defined covariate types. It should be in the same order as covnames. If a custom prediction function is not required for a particular covariate, then that index should be set to NA.
Vector of character strings specifying the names of baseline covariates in obs_data.
A vector of one or more values denoting the maximum number of times a binary covariate representing a visit process may be missed before the individual is censored from the data (in the observed data) or a visit is forced (in the simulated data). Multiple values exist in the vector when the modeling of more than one covariate is attached to a visit process.
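One possible reading of the forced-visit rule for simulated data is sketched below. The function force_visits is hypothetical, and the exact counting convention (a visit is forced once max_visits consecutive visits have been missed) is an assumption of this sketch rather than a statement about the package internals.

```r
# drawn: visit indicators drawn from a fitted model (1 = visit, 0 = missed).
# max_visits: number of consecutive misses allowed before a visit is forced.
force_visits <- function(drawn, max_visits) {
  missed <- 0
  out <- integer(length(drawn))
  for (t in seq_along(drawn)) {
    if (missed >= max_visits) {
      out[t] <- 1L                    # too many misses in a row: force a visit
    } else {
      out[t] <- as.integer(drawn[t])  # otherwise keep the drawn indicator
    }
    # Reset the counter after a visit, otherwise count one more miss.
    missed <- if (out[t] == 1L) 0 else missed + 1
  }
  out
}

force_visits(c(1, 0, 0, 0, 0, 1), max_visits = 2)
```

With max_visits = 2, the fourth time point is forced to a visit because the two preceding visits were missed.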
Logical scalar specifying the convention used for lagi and lag_cumavgi terms in the model statements when pre-baseline times are not included in obs_data and when the current time index, \(t\), is such that \(t < i\). If this argument is set to FALSE, all lagi and lag_cumavgi terms in this context are set to 0 (for non-categorical covariates) or the reference level (for categorical covariates). If this argument is set to TRUE, the lagi and lag_cumavgi terms are set to their values at time 0. The default is FALSE.
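The two conventions can be compared on a short series. The helper lag_sketch is hypothetical and handles only a non-categorical covariate for a single individual; it is meant to illustrate the rule, not to reproduce the package's history functions.

```r
# x: covariate values at times 0, 1, 2, ...; i: the lag.
lag_sketch <- function(x, i, baselags = FALSE) {
  n <- length(x)
  # Value used at times t < i, where no lagged observation exists.
  fill <- if (baselags) x[1] else 0
  c(rep(fill, min(i, n)), x[seq_len(max(n - i, 0))])
}

x <- c(5, 6, 7, 8)  # values at times 0 to 3

lag_sketch(x, 2)                   # baselags = FALSE: pad with 0
lag_sketch(x, 2, baselags = TRUE)  # baselags = TRUE: pad with the time-0 value
```

From time 2 onward both conventions agree, since the lag-2 value is then an observed one.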
Logical scalar indicating whether the observed data set contains rows for time \(t < 0\).
Numeric scalar specifying the lowest value of time \(t\) in the observed data set.
Logical scalar indicating whether to print a progress bar for the number of bootstrap samples completed in the R console. This argument is only applicable when parallel is set to FALSE and bootstrap samples are used (i.e., nsamples is set to a value greater than 0). The default is TRUE.
Progress bar R6 object. See progress_bar for further details.
Vector of logicals. The kth element is a logical specifying whether to carry forward the intervened value (rather than the natural value) of the treatment variable(s) when performing a carry forward restriction type for the kth intervention in interventions. When the kth element is set to FALSE, the natural value of the treatment variable(s) in the kth intervention in interventions will be carried forward. By default, this argument is set so that the intervened value of the treatment variable(s) is carried forward for all interventions.
Logical scalar indicating whether to truncate simulated covariates to their range in the observed data set. This argument is only applicable for covariates of type "normal", "bounded normal", "truncated normal", and "zero-inflated normal".
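Truncation to the observed range amounts to clamping each simulated value between the observed minimum and maximum. The numbers below are made up, and the use of pmin() and pmax() is just one way to express the idea.

```r
# Observed minimum and maximum of a hypothetical covariate.
obs_range <- c(0.1, 7.3)

# Simulated draws, two of which fall outside the observed range.
drawn <- c(-1.2, 3.4, 9.9)

# Clamp from below, then from above.
truncated <- pmin(pmax(drawn, obs_range[1]), obs_range[2])

truncated
```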
Other arguments, which are passed to the functions in covpredict_custom.