localLLM (version 1.3.0)

generate_parallel: Generate Text in Parallel for Multiple Prompts

Description

Generates completions for multiple prompts in a single call, evaluating them in parallel within one context and returning one generated text per prompt.

Usage

generate_parallel(
  context,
  prompts,
  max_tokens = 100L,
  top_k = 40L,
  top_p = 1,
  temperature = 0,
  repeat_last_n = 0L,
  penalty_repeat = 1,
  seed = 1234L,
  progress = interactive(),
  verbosity = 0L,
  clean = FALSE,
  hash = TRUE
)

Value

Character vector of generated texts, one element per prompt, in the same order as `prompts`. When `hash = TRUE`, SHA-256 hashes of the prompts and outputs are attached as the `"hashes"` attribute.

Arguments

context

A context object created with context_create.

prompts

Character vector of input text prompts.

max_tokens

Maximum number of tokens to generate (default: 100).

top_k

Top-k sampling parameter (default: 40). Limits sampling to the k most likely tokens.

top_p

Top-p (nucleus) sampling (default: 1.0). Tokens are drawn from the smallest set whose cumulative probability reaches top_p; 1.0 disables the filter.

temperature

Sampling temperature (default: 0.0). Set to 0 for greedy decoding; higher values produce more varied output.

repeat_last_n

Number of recent tokens considered for the repetition penalty (default: 0). Set to 0 to disable.

penalty_repeat

Repetition penalty strength (default: 1.0). Values > 1 discourage repetition; set to 1.0 to disable.

seed

Random seed for reproducible generation (default: 1234). Use positive integers for deterministic output.

progress

Show a console progress bar while batches run. Defaults to interactive(): visible in interactive sessions, suppressed in scripts and R CMD check.

verbosity

Controls backend logging during generation (default: 0L). Larger values print more detail: 0 shows only errors, 1 adds warnings, 2 prints informational messages, and 3 enables the most verbose debug output; negative values suppress backend output entirely. The quiet default means only the progress bar is visible during typical batch runs, matching generate. This differs from model_load and context_create (default 1L), which run once per session and benefit from having warnings visible. Raise to 2L or 3L when debugging llama.cpp internals.

clean

If TRUE, remove common chat-template control tokens from each generated text (default: FALSE).

hash

When `TRUE` (default), computes SHA-256 hashes for the supplied prompts and generated outputs. Hashes are attached via the `"hashes"` attribute for later inspection.
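A brief sketch of inspecting these hashes, assuming `context` is a valid context object with a model already loaded:

out <- generate_parallel(context, c("Hello", "Bonjour"))
attr(out, "hashes")  # SHA-256 hashes of prompts and generated texts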

Details

When more prompts are supplied than the context can hold in parallel (`n_seq_max - 1`), the function automatically processes them in sequential batches while preserving the original ordering of results.
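
Examples

A minimal sketch: the model file path below is a placeholder, and model_load and context_create are shown with only their essential arguments; see their help pages for real settings.

# Load a model and create a context (the GGUF path is illustrative)
model <- model_load("path/to/model.gguf")
context <- context_create(model)

prompts <- c(
  "Summarise the plot of Hamlet in one sentence.",
  "List three prime numbers.",
  "Translate 'good morning' into French."
)

# Greedy decoding (temperature = 0) is deterministic
out <- generate_parallel(context, prompts, max_tokens = 50L)
out

# Sampled decoding with a mild repetition penalty; clean = TRUE strips
# chat-template control tokens from each result
out_sampled <- generate_parallel(
  context, prompts,
  max_tokens = 50L,
  temperature = 0.8,
  top_p = 0.95,
  repeat_last_n = 64L,
  penalty_repeat = 1.1,
  clean = TRUE
)

# Results always come back in the order of `prompts`, even when more
# prompts are supplied than the context holds in parallel and the call
# falls back to sequential batches.
stopifnot(length(out) == length(prompts))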