add_image: Add an image to a tidyprompt (multimodal)

Description

Attach an image to a tidyprompt() for use with multimodal LLMs.

Supports 'ollama', 'openai' (Completions and Responses APIs), and 'ellmer'-backed providers. Converts to and from 'ellmer' content image objects as needed.

Usage

add_image(
  prompt,
  image,
  alt = NULL,
  detail = c("auto", "low", "high"),
  mime = NULL
)

Value

A tidyprompt() with an added prompt_wrap() that attaches an image to the prompt for use with multimodal LLMs.

Arguments

prompt

A single string or a tidyprompt() object

image

An image reference. One of:

a local file path (e.g., "path/to/image.png")
a URL (e.g., "https://.../image.jpg")
a base64 string (optionally with data URL prefix)
a raw vector of bytes
a plot object (e.g., base recordedplot, ggplot, or grid grob) to be rasterized automatically
an 'ellmer' content object created by ellmer::content_image_url(), ellmer::content_image_file(), or ellmer::content_image_plot() (this works with both regular providers and 'ellmer'-backed providers)

For the OpenAI Responses API, URLs must point directly to an image resource (not an HTML page); they are transmitted as a scalar string image_url with an optional detail. Supplying a webpage URL (e.g., a Wikipedia media viewer link) results in a provider 400 error stating that an image URL string was expected.
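The raw-vector and base64 forms in the list above are the least obvious to construct, so here is a minimal sketch. It assumes a PNG-capable graphics device and uses the 'jsonlite' package for base64 encoding (an assumption; any encoder should do). The add_image() calls are guarded so the snippet still runs when 'tidyprompt' is not installed, and all variable names are illustrative.

```r
# Render a small plot to a temporary PNG file (base R graphics only)
png_path <- tempfile(fileext = ".png")
grDevices::png(png_path, width = 300, height = 300)
plot(mtcars$mpg, mtcars$disp)
invisible(grDevices::dev.off())

# Form 1: a raw vector holding the file's bytes
raw_bytes <- readBin(png_path, what = "raw", n = file.size(png_path))

# Form 2: a base64 string with a data URL prefix
# (assumes 'jsonlite' is installed; stays NULL otherwise)
data_url <- NULL
if (requireNamespace("jsonlite", quietly = TRUE)) {
  data_url <- paste0("data:image/png;base64,", jsonlite::base64_enc(raw_bytes))
}

if (requireNamespace("tidyprompt", quietly = TRUE)) {
  # Raw bytes carry no type information, so state the mime type explicitly
  raw_prompt <- "What is shown in this image?" |>
    tidyprompt::add_image(raw_bytes, mime = "image/png")

  # A data URL already names its mime type, so `mime` is left at NULL
  if (!is.null(data_url)) {
    b64_prompt <- "What is shown in this image?" |>
      tidyprompt::add_image(data_url)
  }
}
```

Passing the file path directly is usually simpler; the raw and base64 forms are mainly useful when the image already lives in memory or arrives from another API.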

alt

Optional alternative text/alt description

detail

Detail hint for some providers (OpenAI): one of "auto", "low", "high"

mime

Optional MIME type when providing raw bytes or a base64 string without a data URL prefix (e.g., "image/png")
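The Usage signature gives detail a vector default, which is the usual R idiom for arguments resolved with match.arg(). Assuming add_image() follows that idiom (this page does not confirm it), omitting detail would select the first value, "auto". The function resolve_detail() below is a hypothetical stand-in written only to illustrate the idiom; it is not part of 'tidyprompt'.

```r
# Hypothetical stand-in illustrating the match.arg() idiom; not tidyprompt code
resolve_detail <- function(detail = c("auto", "low", "high")) {
  # With the default left in place, match.arg() returns the first choice;
  # otherwise it validates the supplied value against the choices
  match.arg(detail)
}

resolve_detail()        # first choice is used when the argument is omitted
resolve_detail("high")  # an explicit, valid value is returned

# An unsupported value raises an error instead of being passed along
bad <- tryCatch(resolve_detail("medium"), error = function(e) "error")
```

If add_image() behaves this way, a misspelled detail value would fail when the prompt is built rather than when the request reaches the provider.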

Examples


# Create a prompt with a remote image (web URL)
image_prompt <- "What is shown in this image?" |>
  add_image("https://upload.wikimedia.org/wikipedia/commons/3/3a/Cat03.jpg")

# Create a prompt with a local image (file path)
# First save an image to a temporary file
cat_img_file <- tempfile(fileext = ".jpg")
download.file(
  "https://upload.wikimedia.org/wikipedia/commons/3/3a/Cat03.jpg",
  destfile = cat_img_file,
  mode = "wb"
)
# Then build prompt with local image
local_image_prompt <- "What is shown in this image?" |>
  add_image(cat_img_file)

# Send prompt to different LLM providers
# (example is not run because it requires configured LLM providers)
if (FALSE) {
  # OpenAI-compatible
  send_prompt(image_prompt, llm_provider_openai(parameters = list(model = "gpt-4o-mini")))
  # --- Sending request to LLM provider (gpt-4o-mini): ---
  # What is shown in this image?
  # --- Receiving response from LLM provider: ---
  # The image shows a close-up of an orange tabby cat, characterized by its
  # striped fur and distinctive golden eyes. The background appears blurred,
  # suggesting a softly focused environment.

  # Ollama-compatible
  send_prompt(image_prompt, llm_provider_ollama(parameters = list(model = "qwen3-vl:2b")))
  # ...

  # 'ellmer'-compatible
  send_prompt(image_prompt, llm_provider_ellmer(ellmer::chat_openai(model = "gpt-4o-mini")))
  # ...
}

# Create a prompt with a plot (e.g., 'ggplot2' plot)
if (FALSE) {
  if (requireNamespace("ggplot2", quietly = TRUE)) {
    plot <- ggplot2::ggplot(mtcars, ggplot2::aes(mpg, disp)) +
      ggplot2::geom_point()

    plot_prompt <- "Describe this plot" |>
      add_image(plot)

    send_prompt(plot_prompt, llm_provider_openai())
    # --- Sending request to LLM provider (gpt-4o-mini): ---
    # Describe this plot
    # --- Receiving response from LLM provider: ---
    # The plot is a scatter plot depicting the relationship between two variables:
    # "mpg" (miles per gallon) on the x-axis and "disp" (displacement) on
    # the y-axis. (...)
  }
}
