run_simulation: Run a simulation with a specified configuration

Description

This function runs a complete simulation based on the provided eam_simulation_config object, which is generated by the new_simulation_config function.

Usage

run_simulation(config, output_dir = NULL)

Value

An S3 object of class eam_simulation_output containing the output metadata and the file system paths of the persisted results.

Arguments

config: An eam_simulation_config object containing all simulation parameters. Use new_simulation_config to create one.
output_dir: Directory in which to save the out-of-core results. Optional; defaults to NULL, in which case a temporary directory is used.

Details

This function uses an out-of-core approach to handle potentially large simulation results. Instead of returning a data frame directly, it persists the data to disk and returns an eam_simulation_output object that contains metadata and file system paths.

To access the simulation data, use the following methods on the returned object:

open_dataset() - Returns an Arrow Dataset containing the simulation results, e.g. sim_output$open_dataset()
open_evaluated_conditions() - Returns an Arrow Dataset containing the evaluated condition parameters, e.g. sim_output$open_evaluated_conditions()

Both methods return Arrow Dataset objects rather than data frames, allowing for efficient querying and filtering before loading data into memory. To convert to a data frame, use dplyr::collect() or as.data.frame().
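
As a minimal sketch of what querying before loading can look like, the snippet below writes a small toy table to disk with the arrow package, reopens it lazily, and filters it with dplyr before calling collect(). It does not use this package's output: the toy data, the directory name, and the column names condition_idx and rt are assumptions made only for illustration, so check the actual columns of your dataset (for example with names(dataset)) before adapting it.

```r
library(arrow)
library(dplyr)

# Hypothetical stand-in for simulation results; the column names
# (condition_idx, rt) are assumptions made for this sketch only.
toy <- data.frame(
  condition_idx = rep(1:3, each = 4),
  rt = c(0.42, 0.55, 0.61, 0.48,
         0.73, 0.66, 0.59, 0.80,
         0.35, 0.41, 0.52, 0.47)
)

# Persist to disk and reopen lazily as an Arrow Dataset
toy_dir <- file.path(tempdir(), "toy_sim_output")
arrow::write_dataset(toy, toy_dir, format = "parquet")
ds <- arrow::open_dataset(toy_dir)

# The filter and column selection are evaluated by Arrow;
# only the matching rows are materialised by collect().
fast_cond2 <- ds |>
  dplyr::filter(condition_idx == 2, rt < 0.7) |>
  dplyr::select(condition_idx, rt) |>
  dplyr::collect()

fast_cond2
```

The same filter/select/collect pattern should apply to the object returned by sim_output$open_dataset(), with the real column names substituted.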

Throughout this package, the eam_simulation_output object is used as the standard parameter for downstream analysis functions, rather than passing Arrow objects or data frames directly.

In multi-item backends, at most one item can reach the threshold at each discrete time point, so the temporal precision of threshold detection is determined by the dt parameter. This design was chosen for performance, and its effect is negligible in almost all experimental scenarios. If it is critical for your use case, increase the temporal resolution by reducing dt. For implementation details, refer to the backend source code (accumulate_evidence_* functions).
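
To see why a smaller dt helps, consider the following toy illustration in base R. It does not call the package's backends and is not their implementation: the noise-free accumulators, the drift values, and the helper crossing_step() are all hypothetical. It only shows that two threshold crossings that share one time step on a coarse grid end up in separate steps on a finer grid.

```r
# Toy, noise-free accumulators (an assumption of this sketch): evidence
# grows as drift * t and the threshold is 1, so the exact crossing time
# of each item is 1 / drift.
drift <- c(item_a = 1.001, item_b = 1.004)
crossing_time <- 1 / drift

# Index of the first time step (on a grid of width dt) at or after each
# crossing. crossing_step() is a hypothetical helper, not a package function.
crossing_step <- function(dt) ceiling(crossing_time / dt)

steps_coarse <- crossing_step(dt = 0.01)   # coarse grid
steps_fine   <- crossing_step(dt = 0.001)  # grid ten times finer

# Do both items cross within the same time step?
same_step_coarse <- steps_coarse[["item_a"]] == steps_coarse[["item_b"]]
same_step_fine   <- steps_fine[["item_a"]] == steps_fine[["item_b"]]

same_step_coarse
same_step_fine
```

With dt = 0.01 both crossings fall in the same step, which is the situation where only one item can be registered; with dt = 0.001 they are resolved separately, at the cost of ten times as many steps.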

Examples


# Define formulas for the simulation
prior_formulas <- list(
  V ~ distributional::dist_uniform(0.1, 1.0),
  ndt ~ 0.3,
  noise_coef ~ 1
)

between_trial_formulas <- list()

item_formulas <- list(
  A_upper ~ 1,
  A_lower ~ -1,
  V ~ V
)

# Define noise factory
noise_factory <- function(context) {
  noise_coef <- context$noise_coef
  function(n, dt) {
    noise_coef * rnorm(n, mean = 0, sd = sqrt(dt))
  }
}

# Create configuration
config <- new_simulation_config(
  prior_formulas = prior_formulas,
  between_trial_formulas = between_trial_formulas,
  item_formulas = item_formulas,
  n_conditions = 10,
  n_trials_per_condition = 10,
  n_items = 5,
  max_reached = 5,
  max_t = 10,
  dt = 0.01,
  noise_mechanism = "add",
  noise_factory = noise_factory,
  model = "ddm",
  parallel = FALSE
)

# Run simulation
sim_output <- run_simulation(config)

# Access results
dataset <- sim_output$open_dataset()
dataset # an Arrow Dataset object

# To load the results into memory, convert the dataset to a data frame
df <- as.data.frame(dataset)
head(df)

# Access evaluated condition parameters
cond_dataset <- sim_output$open_evaluated_conditions()
df_cond <- as.data.frame(cond_dataset)
head(df_cond)
