Generate Text in Parallel for Multiple Prompts
Usage:

generate_parallel(
  context,
  prompts,
  max_tokens = 100L,
  top_k = 40L,
  top_p = 1,
  temperature = 0,
  repeat_last_n = 0L,
  penalty_repeat = 1,
  seed = 1234L,
  progress = interactive(),
  verbosity = 0L,
  clean = FALSE,
  hash = TRUE
)

Value:

A character vector of generated texts, one element per input prompt, in the same order as `prompts`.
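A minimal sketch of a typical call, assuming a local GGUF model file at a hypothetical path; the one-argument signatures shown for `model_load()` and `context_create()` are assumptions based on the companion functions named in the arguments below:

# Hypothetical model path and assumed signatures for model_load() and
# context_create(); adjust to your setup.
model <- model_load("models/example-model.gguf")
ctx <- context_create(model)

prompts <- c("What is the capital of France?",
             "Name a prime number greater than 10.")

# Greedy decoding (temperature = 0) is the default.
out <- generate_parallel(ctx, prompts, max_tokens = 50L)
out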
Arguments:

`context`: A context object created with `context_create()`.

`prompts`: Character vector of input text prompts.

`max_tokens`: Maximum number of tokens to generate (default: 100).

`top_k`: Top-k sampling parameter (default: 40). Limits the vocabulary to the k most likely tokens.

`top_p`: Top-p (nucleus) sampling (default: 1.0). Probability threshold for token selection.

`temperature`: Sampling temperature (default: 0.0). Set to 0 for greedy decoding; higher values increase creativity.

`repeat_last_n`: Number of recent tokens considered for the repetition penalty (default: 0). Set to 0 to disable.

`penalty_repeat`: Repetition penalty strength (default: 1.0). Values > 1 discourage repetition; set to 1.0 to disable.

`seed`: Random seed for reproducible generation (default: 1234). Use positive integers for deterministic output.

`progress`: Show a console progress bar while batches run. Defaults to `interactive()`: visible in interactive sessions, suppressed in scripts and under R CMD check.

`verbosity`: Controls backend logging during generation (default: 0L). Larger values print more detail: 0 shows only errors, 1 adds warnings, 2 prints informational messages, and 3 enables the most verbose debug output. Negative values fully suppress backend output. The quiet default (0L) means only the progress bar is visible during typical batch runs, matching `generate()`. This differs from `model_load()` and `context_create()` (default 1L), which run once per session and benefit from visible warnings. Raise to 2L or 3L when debugging llama.cpp internals.

`clean`: If TRUE, remove common chat-template control tokens from each generated text (default: FALSE).

`hash`: When `TRUE` (default), computes SHA-256 hashes of the supplied prompts and generated outputs and attaches them via the `"hashes"` attribute for later inspection (see the example after this list).
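As an illustration of the sampling and hashing arguments, a hedged sketch reusing the `ctx` and `prompts` objects from the usage example above; the parameter values are illustrative, not recommendations:

# Creative sampling with a mild repetition penalty.
out <- generate_parallel(
  ctx, prompts,
  max_tokens     = 120L,
  temperature    = 0.8,
  top_k          = 40L,
  top_p          = 0.95,
  repeat_last_n  = 64L,
  penalty_repeat = 1.1,
  seed           = 42L,
  clean          = TRUE
)

# With hash = TRUE (the default), SHA-256 hashes of the prompts and
# outputs are attached to the result.
attr(out, "hashes")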
Details:

When more prompts are supplied than the context can hold in parallel (`n_seq_max - 1`), the function automatically processes them in sequential batches while preserving the original ordering of results.
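The batch count follows directly from the context's parallel capacity. A sketch of the arithmetic, assuming a hypothetical context created with `n_seq_max = 8` (how that value is set or queried is outside this function's interface):

# Hypothetical capacity: n_seq_max = 8 allows up to 7 prompts per batch.
n_seq_max <- 8L
n_prompts <- 20L
ceiling(n_prompts / (n_seq_max - 1L))  # 3 sequential batches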