openaiRtools (version 0.2.2)

multimodal: Helper Functions for Multimodal Content

Description

Most modern LLMs (GPT-4o, Claude 3, Gemini, Qwen-VL, etc.) support multimodal input: you can send both text and images in the same message. Images are embedded inside the content field of a "user" message as a list of content parts.
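As a rough illustration of that structure, the sketch below builds such a message by hand from plain R lists, without calling any openaiRtools function. The field names (type, text, image_url, url) follow the Chat Completions content-part format. The URL is a placeholder, and the objects the package helpers actually return are not documented on this page, so treat the shape as an assumption rather than the package's output.

```r
# Hand-built "user" message with one text part and one image part.
# Plain base R lists only; the URL is a placeholder, not a real image.
user_message <- list(
  role = "user",
  content = list(
    list(type = "text", text = "What is shown in this picture?"),
    list(
      type = "image_url",
      image_url = list(url = "https://example.com/cat.png")
    )
  )
)

# Inspect the nested structure.
str(user_message)
```

Each element of content is one "part", so adding a second image means appending another image_url part to the same list.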

There are three ways to provide an image:

  1. Image URL — the model downloads it directly (image_from_url)

  2. Local file — read and Base64-encoded automatically (image_from_file)

  3. R plot — save a ggplot2 / base R figure and send it (image_from_plot)
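To make the Base64 step for local files concrete, here is a minimal base R sketch of how file bytes can be turned into a data URL and wrapped in an image part. Both base64_encode and file_to_data_url are hypothetical helpers written for this illustration; they are not openaiRtools functions, and the package may well rely on a dedicated encoding package instead. The demo file contains only the 8-byte PNG signature as a stand-in for a real image.

```r
# Minimal Base64 encoder in base R (hypothetical helper, illustration only).
base64_encode <- function(bytes) {
  alphabet <- c(LETTERS, letters, 0:9, "+", "/")
  pad <- (3 - length(bytes) %% 3) %% 3           # zero bytes needed to fill the last group
  m <- matrix(c(as.integer(bytes), rep(0L, pad)), nrow = 3)
  v <- m[1, ] * 65536 + m[2, ] * 256 + m[3, ]    # one 24-bit value per 3-byte group
  idx <- rbind(v %/% 262144, (v %/% 4096) %% 64, (v %/% 64) %% 64, v %% 64)
  chars <- alphabet[idx + 1]                     # four 6-bit symbols per group
  if (pad > 0) chars[(length(chars) - pad + 1):length(chars)] <- "="
  paste(chars, collapse = "")
}

# Hypothetical helper: read a file as raw bytes and wrap it in a data URL.
file_to_data_url <- function(path, mime_type = "image/png") {
  bytes <- readBin(path, what = "raw", n = file.size(path))
  paste0("data:", mime_type, ";base64,", base64_encode(bytes))
}

# Stand-in file holding only the PNG signature (not a viewable image);
# a real call would point at an actual image file.
demo_path <- tempfile(fileext = ".png")
writeBin(as.raw(c(0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A)), demo_path)
data_url <- file_to_data_url(demo_path)

# The data URL takes the place of a web URL inside the image part.
image_part <- list(type = "image_url", image_url = list(url = data_url))
```

An R plot can presumably follow the same path once it has been written to a file, for example with png() or ggplot2::ggsave(), since the result is then just another local image.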

Use create_multimodal_message to combine text and multiple images into a single, ready-to-use message object.
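The arguments of create_multimodal_message are not listed on this page, so the following is only a sketch of the general idea using a hypothetical stand-in, combine_parts, written in base R. It assumes the resulting message is a "user" message whose content holds one text part followed by one image part per URL; the URLs are placeholders.

```r
# Hypothetical stand-in, NOT the package's create_multimodal_message().
# Combines one text prompt and several image URLs into a single message.
combine_parts <- function(text, image_urls) {
  # One image part per URL.
  image_parts <- lapply(image_urls, function(u) {
    list(type = "image_url", image_url = list(url = u))
  })
  list(
    role = "user",
    # Text part first, then all image parts, as one flat list of parts.
    content = c(list(list(type = "text", text = text)), image_parts)
  )
}

msg <- combine_parts(
  "Compare these two charts.",
  c("https://example.com/chart-a.png", "https://example.com/chart-b.png")
)

# Number of content parts in the combined message.
length(msg$content)
```

Keeping everything in one message lets the model see the prompt and all images together, rather than spreading them over several turns.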

Arguments

Details

These functions construct image content objects for sending images to vision-capable LLMs via the Chat Completions API.