
LLMR (version 0.4.2)

call_llm_par: Parallel LLM Processing with Tibble-Based Experiments (Core Engine)

Description

Processes experiments from a tibble in which each row pairs an llm_config object with a message list. This is the core parallel processing engine; metadata columns are preserved in the output. Before calling it, set up the parallel backend with `setup_llm_parallel()`.
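A minimal sketch of the expected input shape. The provider, model, and prompts below are illustrative placeholders; `llm_config()`, `setup_llm_parallel()`, and `reset_llm_parallel()` are LLMR functions, and a valid API key would be needed for the call to succeed.

```r
library(LLMR)
library(tibble)

# Placeholder config; provider/model/key are illustrative
cfg <- llm_config(provider = "openai", model = "gpt-4o-mini",
                  api_key = Sys.getenv("OPENAI_API_KEY"))

experiments <- tibble(
  prompt_id = 1:2,                 # metadata column, preserved in the output
  config = list(cfg, cfg),         # required list-column of llm_config objects
  messages = list(                 # required list-column of message lists
    list(list(role = "user", content = "Say A")),
    list(list(role = "user", content = "Say B"))
  )
)

setup_llm_parallel(workers = 2)
results <- call_llm_par(experiments)
reset_llm_parallel()
```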

Usage

call_llm_par(
  experiments,
  simplify = TRUE,
  tries = 10,
  wait_seconds = 2,
  backoff_factor = 2,
  verbose = FALSE,
  memoize = FALSE,
  max_workers = NULL,
  progress = FALSE,
  json_output = NULL
)

Value

A tibble containing all original columns from experiments (metadata, config, messages), plus new columns: response_text, raw_response_json (the raw JSON string from the API), success, error_message, duration (in seconds).
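For instance, given a results tibble returned by `call_llm_par()`, the `success` and `error_message` columns make it straightforward to separate failed calls from usable responses (base R shown; the column names follow the description above):

```r
# Rows whose API call ultimately failed, with the reason
failed <- subset(results, !success,
                 select = c(error_message))

# Fraction of calls that succeeded, and total wall time spent per call
success_rate <- mean(results$success)
mean_duration <- mean(results$duration, na.rm = TRUE)
```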

Arguments

experiments

A tibble/data.frame with required list-columns 'config' (llm_config objects) and 'messages' (message lists). Additional columns are treated as metadata and preserved.

simplify

Logical. If TRUE (the default), the result columns are cbind-ed to the input 'experiments' tibble so metadata columns are carried through; if FALSE, only the result columns are returned.

tries

Integer. Number of retries for each call. Default is 10.

wait_seconds

Numeric. Initial wait time (seconds) before retry. Default is 2.

backoff_factor

Numeric. Multiplier for wait time after each failure. Default is 2.

verbose

Logical. If TRUE, prints progress and debug information.

memoize

Logical. If TRUE, enables caching for identical requests.

max_workers

Integer. Maximum number of parallel workers. If NULL, auto-detects.

progress

Logical. If TRUE, shows progress bar.

json_output

Deprecated. Raw JSON string is always included as raw_response_json. This parameter is kept for backward compatibility but has no effect.

Examples

if (FALSE) {
  library(dplyr)
  library(tidyr)

  # small_config / large_config and control_messages / treatment_messages
  # are assumed to be pre-built llm_config objects and message lists
  # Build experiments with expand_grid
  experiments <- expand_grid(
    condition = c("control", "treatment"),
    model_type = c("small", "large"),
    rep = 1:10
  ) |>
    mutate(
      config = case_when(
        model_type == "small" ~ list(small_config),
        model_type == "large" ~ list(large_config)
      ),
      messages = case_when(
        condition == "control" ~ list(control_messages),
        condition == "treatment" ~ list(treatment_messages)
      )
    )

  setup_llm_parallel(workers = 4)
  results <- call_llm_par(experiments, progress = TRUE)
  reset_llm_parallel()

  # All metadata preserved for analysis
  results |>
    group_by(condition, model_type) |>
    summarise(mean_response = mean(as.numeric(response_text), na.rm = TRUE))
}
