Stores information necessary to simulate and visualize datasets based
on an underlying distribution Z.
simdesign(
  generator,
  transform_initial = base::identity,
  n_var_final = -1,
  types_final = NULL,
  names_final = NULL,
  prefix_final = "v",
  process_final = list(),
  name = "Simulation design",
  check_and_infer = TRUE,
  ...
)
List object with class attribute "simdesign" (S3 class) containing the following entries (if no further information given, entries are directly saved from user input):
generator
name
transform_initial
n_var_final
types_final
names_final
process_final
entries for further information as passed by the user
generator: Function which generates data from the underlying base distribution. It is
assumed to take the number of simulated observations n_obs as its first
argument, as all random generation functions in the stats and
extraDistr packages do. Furthermore, it is expected to return a two-dimensional
array as output (matrix or data.frame). Alternatively, this can be an R object
derived from the simdata::simdesign class. See details.
transform_initial: Function which specifies the transformation of the underlying
dataset Z to the final dataset X. See details.
n_var_final: Integer, number of columns in the final datamatrix X. Can be inferred when
check_and_infer is TRUE.
types_final: Optional vector of length equal to n_var_final (set by the user or
inferred) and hence to the number of columns of the final dataset X.
Allowed entries are "logical", "factor" and "numeric".
Stores the type of the columns of X.
If not specified by the user, it is inferred if check_and_infer is set to TRUE.
names_final: NULL or character vector with variable names for the final dataset X.
Its length needs to equal the number of columns of X.
Overrides other naming options. See details.
prefix_final: NULL or prefix attached to variables in the final dataset X. Overridden
by the names_final argument. Set to NULL if no prefixes should
be added. See details.
process_final: List of lists specifying post-processing functions applied to the final
datamatrix X before returning it. See do_processing.
name: Character, optional name of the simulation design.
check_and_infer: If TRUE, then the simulation design is tested by simulating 5 observations
using simulate_data. If everything works without error,
the variables n_var_final and types_final will be inferred
from the results if not already set correctly by the user.
...: Further arguments are directly stored in the list object to be passed to
simulate_data.
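The roles of generator, transform_initial and check_and_infer can be illustrated with a minimal base R sketch. It does not call simdata and is not the package's implementation: the column names age, treated and group, and the way the types are read off a 5-row trial, are assumptions made only for illustration.

```r
# Generator: takes the number of observations as its first argument and
# returns a two-dimensional array (here a matrix Z with 3 columns).
generator <- function(n_obs) {
  matrix(stats::rnorm(n_obs * 3), nrow = n_obs, ncol = 3)
}

# Hypothetical transformation from the initial data Z to the final data X.
transform_initial <- function(z) {
  data.frame(
    age = 50 + 10 * z[, 1],                      # numeric column
    treated = z[, 2] > 0,                        # logical column
    group = factor(ifelse(z[, 3] > 0, "high", "low"),
                   levels = c("low", "high"))    # factor column
  )
}

# Assumed analogue of check_and_infer: simulate 5 observations and read
# the number of columns and the column types off the result.
set.seed(19)
trial <- transform_initial(generator(5))
n_var_final <- ncol(trial)
types_final <- vapply(trial, function(col) class(col)[1], character(1))
```

Here n_var_final and types_final are derived from the trial data, which mirrors the idea that the user does not have to supply them by hand.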
If check_and_infer is set to TRUE, the following procedure determines
the names of the variables:
1. use names_final if specified and of correct length
2. otherwise, use the names of transform_initial if present and of
   correct length
3. otherwise, use prefix_final to prefix the variable number if
   not NULL
4. otherwise, use names from the dataset as generated by the generator
   function
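The precedence above can be sketched as a small base R helper. The function resolve_names and all of its arguments are hypothetical and not part of simdata; in particular, pasting the prefix directly onto the variable number (giving "v1", "v2", ...) is an assumption about the naming format.

```r
# Hypothetical helper mirroring the documented naming precedence.
# n_var is the number of columns of the final dataset X.
resolve_names <- function(n_var,
                          names_final = NULL,
                          transform_names = NULL,
                          prefix_final = "v",
                          generator_names = NULL) {
  # 1. names_final wins if given and of correct length
  if (!is.null(names_final) && length(names_final) == n_var) {
    return(names_final)
  }
  # 2. otherwise names attached to the transformation, if of correct length
  if (!is.null(transform_names) && length(transform_names) == n_var) {
    return(transform_names)
  }
  # 3. otherwise prefix plus variable number, unless the prefix is NULL
  if (!is.null(prefix_final)) {
    return(paste0(prefix_final, seq_len(n_var)))
  }
  # 4. otherwise fall back to the names produced by the generator
  generator_names
}

resolve_names(2, names_final = c("a", "b"))
resolve_names(3, names_final = c("a", "b"))  # wrong length, falls through
resolve_names(2, prefix_final = NULL, generator_names = c("z1", "z2"))
```

Note that in this sketch a names_final vector of the wrong length is silently skipped rather than raising an error; whether the package warns in that case is not stated here.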
This class is intended to be used as a template for simulation designs
which are based on specific underlying distributions. Such a template
only needs to define the generator function and its construction, and
pass it to this function along with the other arguments. See
simdesign_mvtnorm for an example.
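As a rough sketch of such a template, the constructor below builds a design on independent uniform variables. The names simdesign_unif and "unif_simdesign" are hypothetical and not part of simdata, and prepending a subclass to the class attribute is an assumption about how a template might mark its objects. The code requires the simdata package to be installed.

```r
library(simdata)

# Hypothetical template: a design based on independent uniform variables.
simdesign_unif <- function(n_var = 2, lower = 0, upper = 1,
                           name = "Uniform based simulation design", ...) {
  # The generator takes the number of observations as first argument
  # and returns a matrix with n_var columns.
  generator <- function(n) {
    matrix(stats::runif(n * n_var, min = lower, max = upper),
           nrow = n, ncol = n_var)
  }
  # Pass the generator and all remaining arguments on to simdesign().
  dsgn <- simdesign(generator = generator, name = name, ...)
  # Assumed convention: mark the object with an additional subclass.
  class(dsgn) <- c("unif_simdesign", class(dsgn))
  dsgn
}

design <- simdesign_unif(n_var = 3)
sim <- simulate_data(design, 10, seed = 1)
```

Because the object still inherits from "simdesign", it can be passed to simulate_data like any other design.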
The simdesign class should be used in the following workflow:
1. Specify a design template which will be used in subsequent data generating / visualization steps.
2. Sample / visualize a datamatrix following the template (possibly
   multiple times) using simulate_data.
3. Use the sampled datamatrix for a simulation study.
For more details on generators and transformations, please see the
documentation of simulate_data.
For details on post-processing, please see the documentation of
do_processing.
simdesign_mvtnorm, simulate_data, simulate_data_conditional
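library(simdata)
# Generator drawing n observations from a univariate standard normal
generator <- function(n) mvtnorm::rmvnorm(n, mean = 0)
sim_design <- simdesign(generator)
# Draw 10 observations with a fixed seed for reproducibility
simulate_data(sim_design, 10, seed = 19)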