quallmer (version 0.4.0)

qlm_code: Code qualitative data with an LLM

Description

Applies a codebook to input data using a large language model, returning a rich object that includes the codebook, execution settings, results, and metadata for reproducibility.

Usage

qlm_code(x, codebook, model, ..., batch = FALSE, name = NULL, notes = NULL)

Value

A qlm_coded object (a tibble with additional attributes):

Data columns

The coded results with a .id column for identifiers.

Attributes

data, input_type, and run (list containing name, batch, call, codebook, chat_args, execution_args, metadata, parent).

The object prints as a tibble and can be used directly in data manipulation workflows. The batch flag in the run attribute indicates whether batch processing was used. The execution_args element of run contains all non-chat execution arguments (for either parallel or batch processing).
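As a rough sketch of how the returned object might be inspected, assuming the attributes listed above are reachable with base R's attr() and that data_codebook_sentiment (used in the Examples) is available from the package:

```r
library(quallmer)

# Not run: requires API credentials and internet access.
if (FALSE) {
  texts <- c("I love this product!", "Terrible experience.", "It's okay.")
  coded <- qlm_code(texts, data_codebook_sentiment, model = "openai/gpt-4o-mini")

  # Run-level metadata stored alongside the coded results
  run <- attr(coded, "run")
  run$batch           # expected to be FALSE, since batch was left at its default
  run$execution_args  # non-chat execution arguments recorded for this run

  # Other documented attributes
  attr(coded, "input_type")
}
```

Because the object is a tibble, the data columns themselves can be accessed as usual, for example with coded$.id.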

Arguments

x

Input data: a character vector of texts (for text codebooks) or file paths to images (for image codebooks). Named vectors will use names as identifiers in the output; unnamed vectors will use sequential integers.

codebook

A codebook object created with qlm_codebook(). Also accepts deprecated task() objects for backward compatibility.

model

Provider (and optionally model) name in the form "provider/model" or "provider" (which will use the default model for that provider). Passed to the name argument of ellmer::chat(). Examples: "openai/gpt-4o-mini", "anthropic/claude-3-5-sonnet-20241022", "ollama/llama3.2", "openai" (uses default OpenAI model).

...

Additional arguments passed to ellmer::chat(), ellmer::parallel_chat_structured(), or ellmer::batch_chat_structured(). Arguments recognized by ellmer::parallel_chat_structured() or ellmer::batch_chat_structured() are routed there; all other arguments (including provider-specific arguments like base_url, credentials, or api_args for OpenAI-compatible endpoints) are passed to ellmer::chat().

batch

Logical. If TRUE, uses ellmer::batch_chat_structured() instead of ellmer::parallel_chat_structured(). Batch processing is more cost-effective for large jobs but may have longer turnaround times. Default is FALSE. See ellmer::batch_chat_structured() for details.

name

Character string identifying this coding run. Default is NULL.

notes

Optional character string with descriptive notes about this coding run. Useful for documenting the purpose or rationale when viewing results in qlm_trail(). Default is NULL.

Details

Arguments in ... are dynamically routed to ellmer::chat(), ellmer::parallel_chat_structured(), or ellmer::batch_chat_structured() based on their names.
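The routing can be illustrated with a minimal sketch. The endpoint URL below is hypothetical, and max_active is assumed to be an argument recognized by ellmer::parallel_chat_structured() in the installed ellmer version; check that function's help page before relying on it.

```r
library(quallmer)

# Not run: requires API credentials and internet access.
if (FALSE) {
  texts <- c("I love this product!", "Terrible experience.", "It's okay.")

  coded <- qlm_code(
    texts,
    data_codebook_sentiment,
    model = "openai/gpt-4o-mini",
    # Not an execution argument, so it should be passed on to ellmer::chat().
    # Hypothetical OpenAI-compatible endpoint, for illustration only.
    base_url = "http://localhost:8000/v1",
    # Assumed to match an argument of ellmer::parallel_chat_structured(),
    # so it should be routed there instead.
    max_active = 5
  )
}
```

After a run, the execution_args element of the run attribute should show which arguments were treated as execution arguments.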

Progress indicators and error handling are provided by the underlying ellmer::parallel_chat_structured() or ellmer::batch_chat_structured() function. Set verbose = TRUE to see progress messages during coding. Retry logic for API failures should be configured through ellmer's options.

When batch = TRUE, the function uses ellmer::batch_chat_structured() which submits jobs to the provider's batch API. This is typically more cost-effective but has longer turnaround times. The path argument specifies where batch results are cached, wait controls whether to wait for completion, and ignore_hash can force reprocessing of cached results.
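A batch run might look like the following sketch. The file name passed to path is hypothetical, and the example assumes the chosen provider offers a batch API supported by ellmer::batch_chat_structured().

```r
library(quallmer)

# Not run: requires API credentials and internet access.
if (FALSE) {
  texts <- c("I love this product!", "Terrible experience.", "It's okay.")

  coded_batch <- qlm_code(
    texts,
    data_codebook_sentiment,
    model = "openai/gpt-4o-mini",
    batch = TRUE,
    path = "sentiment_batch.json",  # hypothetical cache file for batch results
    wait = TRUE                     # block until the provider finishes the job
  )

  # The run attribute records that batch processing was used
  attr(coded_batch, "run")$batch
}
```

Re-running the same call with the same path should reuse the cached results unless ignore_hash is used to force reprocessing, as described above.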

See Also

qlm_codebook() for creating codebooks, qlm_replicate() for replicating coding runs, qlm_compare() and qlm_validate() for assessing reliability.

Examples

# Requires API credentials and internet access; not run in package checks.
if (FALSE) {
# Basic sentiment analysis
texts <- c("I love this product!", "Terrible experience.", "It's okay.")
coded <- qlm_code(texts, data_codebook_sentiment, model = "openai/gpt-4o-mini")
coded

# With named inputs (names become IDs in output)
texts_named <- c(review1 = "Great service!", review2 = "Very disappointing.")
coded2 <- qlm_code(texts_named, data_codebook_sentiment, model = "openai/gpt-4o-mini")
coded2
}