kuzco

{kuzco} is a simple vision boilerplate built for ollama in R, on top of {ollamar} & {ellmer}. {kuzco} is designed as a computer vision assistant, giving local models guidance on classifying images and returning structured data. The goal is to standardize outputs for image classification and to offer LLMs as an alternative to keras or torch.

{kuzco} currently supports: classification, recognition, sentiment, text extraction, alt-text creation, and custom computer vision tasks.

Installation

You can install the development version of kuzco like so:

devtools::install_github("frankiethull/kuzco")

kuzco 0.1.0 can be installed from CRAN via install.packages("kuzco")!

Example

This is a basic example which shows you how to use kuzco.

library(kuzco)
library(ollamar)

here we have an image and want to learn about it:

test_img <- file.path(system.file(package = "kuzco"), "img/test_img.jpg") 

llm for image classification:

llm_results <- llm_image_classification(llm_model = "qwen2.5vl", image = test_img)
llm_results |> tibble::as_tibble()
#> # A tibble: 1 × 7
#>   image_classification primary_object secondary_object image_description        
#>   <chr>                <chr>          <chr>            <chr>                    
#> 1 animal portrait      puppy          ""               A close-up portrait of a…
#> # ℹ 3 more variables: image_colors <chr>, image_proba_names <chr>,
#> #   image_proba_values <chr>
llm_results |> str()
#> tibble [1 × 7] (S3: tbl_df/tbl/data.frame)
#>  $ image_classification: chr "animal portrait"
#>  $ primary_object      : chr "puppy"
#>  $ secondary_object    : chr ""
#>  $ image_description   : chr "A close-up portrait of a fluffy, curious-looking puppy with a striking patch on its head. The puppy has a white"| __truncated__
#>  $ image_colors        : chr "The image has a palette with shades of white, black, and hints of gray."
#>  $ image_proba_names   : chr "puppy, fur texture, eye, coat"
#>  $ image_proba_values  : chr "[0.85, 0.10, 0.05, 0.05]"
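
The two probability columns come back as plain strings, so they need a little parsing before they can be plotted or filtered. Below is a minimal base R sketch, assuming the strings look like the output above (comma-separated names and a bracketed, comma-separated list of numbers). `parse_proba` is a hypothetical helper, not part of kuzco:

```r
# Hypothetical helper (not part of kuzco): turn the two string columns
# into one row per label. Assumes the formats shown in the output above.
parse_proba <- function(proba_names, proba_values) {
  labels <- trimws(strsplit(proba_names, ",", fixed = TRUE)[[1]])
  values <- gsub("\\[|\\]", "", proba_values)  # drop the surrounding brackets
  values <- as.numeric(trimws(strsplit(values, ",", fixed = TRUE)[[1]]))
  stopifnot(length(labels) == length(values))
  data.frame(label = labels, probability = values)
}

# Strings copied from the classification output above.
proba <- parse_proba(
  proba_names  = "puppy, fur texture, eye, coat",
  proba_values = "[0.85, 0.10, 0.05, 0.05]"
)
proba
```

In this example the values sum to 1.05, so they are better read as rough model-reported scores than as a calibrated probability distribution.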

llm for image sentiment:

llm_emotion <- llm_image_sentiment(llm_model = "qwen2.5vl", image = test_img)

llm_emotion |> str()
#> tibble [1 × 4] (S3: tbl_df/tbl/data.frame)
#>  $ image_sentiment      : chr "positive"
#>  $ image_score          : num 0.8
#>  $ sentiment_description: chr "The soft, warm lighting and the cute features of the puppy create a feeling of happiness and warmth."
#>  $ image_keywords       : chr "cute, friendly, playful, adorable, lovable"
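
Because each call returns a one-row data frame, results for several images can be stacked into one table. Here is a hedged sketch: `classify_many` is a hypothetical wrapper, not part of kuzco, and the function to apply is passed in as `fn`. The stub below stands in for a real model call so the example runs without ollama.

```r
# Hypothetical wrapper (not part of kuzco): apply a kuzco-style function
# to several images and stack the one-row results.
classify_many <- function(images, fn, ...) {
  rows <- lapply(images, function(img) {
    out <- as.data.frame(fn(image = img, ...))
    data.frame(image_path = img, out)  # record which image each row came from
  })
  do.call(rbind, rows)
}

# Stub standing in for a real call such as kuzco::llm_image_sentiment;
# it returns the same columns as the sentiment output above.
fake_sentiment <- function(image, llm_model) {
  data.frame(
    image_sentiment = "positive",
    image_score = 0.8,
    sentiment_description = "placeholder description",
    image_keywords = "cute, friendly"
  )
}

batch <- classify_many(c("a.jpg", "b.jpg"), fake_sentiment, llm_model = "qwen2.5vl")
str(batch)
```

With kuzco installed and a vision model available, `fn` would be `llm_image_sentiment` or `llm_image_classification`, since both take `llm_model` and `image` as shown above.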

llm for image recognition:

note that the backend of kuzco is flexible as well. Users can choose between ‘ollamar’, which suggests structured outputs, and ‘ellmer’, which enforces them.

llm_detection <- llm_image_recognition(llm_model = "qwen2.5vl", 
                                       image = test_img,
                                       recognize_object = "nose")

llm_detection |> str()
#> tibble [1 × 4] (S3: tbl_df/tbl/data.frame)
#>  $ object_recognized : chr "TRUE"
#>  $ object_count      : int 1
#>  $ object_description: chr "A black and white puppy nose, slightly pink inside with dark round nostrils."
#>  $ object_location   : chr "center"
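
In the output above, `object_recognized` is a character column holding "TRUE" rather than a logical value. Here is a small sketch of coercing it before filtering. The data frame is typed in by hand from the output above so the example runs without a model:

```r
# Values copied from the recognition output above.
llm_detection <- data.frame(
  object_recognized  = "TRUE",
  object_count       = 1L,
  object_description = "A black and white puppy nose, slightly pink inside with dark round nostrils.",
  object_location    = "center"
)

# as.logical() understands "TRUE"/"FALSE" (also "true"/"false" and "T"/"F");
# any other string becomes NA, so NA is checked explicitly below.
llm_detection$object_recognized <- as.logical(llm_detection$object_recognized)

found <- !is.na(llm_detection$object_recognized) &
  llm_detection$object_recognized &
  llm_detection$object_count > 0
found
```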

llm for image text extraction:

kuzco is also useful for OCR tasks. Extracting text from an image is showcased below:

text_img <- file.path(system.file(package = "kuzco"), "img/text_img.jpg") 

text_img |> view_image()
llm_extract_txt <- llm_image_extract_text(llm_model = "qwen2.5vl", 
                                          image = text_img,
                                          backend  = "ellmer")

llm_extract_txt |> str()
#> tibble [1 × 2] (S3: tbl_df/tbl/data.frame)
#>  $ text            : chr "Picture of Odin\nas a puppy\ncirca Q4 2019"
#>  $ confidence_score: num 0.99
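
The extracted text is a single string with embedded newlines. Below is a sketch of splitting it into lines and gating on the confidence score. The 0.9 threshold is an arbitrary assumption, the score is reported by the model itself, and the values are copied from the output above:

```r
# Values copied from the text extraction output above.
llm_extract_txt <- data.frame(
  text = "Picture of Odin\nas a puppy\ncirca Q4 2019",
  confidence_score = 0.99
)

min_confidence <- 0.9  # assumed cut-off; tune for your own images

# Only keep the text when the model-reported confidence clears the cut-off.
if (llm_extract_txt$confidence_score >= min_confidence) {
  text_lines <- strsplit(llm_extract_txt$text, "\n", fixed = TRUE)[[1]]
} else {
  text_lines <- character(0)
}
text_lines
```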

newer features

llm image customization:

a new feature in kuzco is a fully customizable function. This allows users to further test computer vision techniques without leaving the kuzco boilerplate.

llm_customized <- llm_image_custom(llm_model = "qwen2.5vl", 
                                   image = test_img,
                                   system_prompt = "you are a dog breed expert, you know all about dogs. 
                                                    tell me the primary breed, secondary breed, and a brief description about both.",
                                   image_prompt  = "tell me what kind of dog is in the image?",
                                   example_df = data.frame(
                                     dog_breed_primary = "hound",
                                     dog_breed_secondary = "corgi",
                                     dog_breed_information = "information about the primary and secondary breed"
                                   ))

llm_customized |> str()
#> 'data.frame':    1 obs. of  3 variables:
#>  $ dog_breed_primary    : chr "terrier"
#>  $ dog_breed_secondary  : chr "spotted"
#>  $ dog_breed_information: chr "The primary breed is likely a terrier based on the facial features and compact size. The secondary breed is 'sp"| __truncated__
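
Since `example_df` defines the shape you expect back, it can double as a check on the result. Here is a sketch using a hypothetical helper, `matches_template`, which is not part of kuzco. The result is typed in from the output above so the example runs without a model:

```r
# Hypothetical helper (not part of kuzco): does the result have the same
# column names and column classes as the template passed as example_df?
matches_template <- function(result, template) {
  col_class <- function(df) vapply(df, function(x) class(x)[1], character(1))
  identical(names(result), names(template)) &&
    identical(col_class(result), col_class(template))
}

example_df <- data.frame(
  dog_breed_primary = "hound",
  dog_breed_secondary = "corgi",
  dog_breed_information = "information about the primary and secondary breed"
)

# Stand-in for the llm_image_custom() output shown above
# (the description is cut off where the printed output is truncated).
llm_customized <- data.frame(
  dog_breed_primary = "terrier",
  dog_breed_secondary = "spotted",
  dog_breed_information = "The primary breed is likely a terrier based on the facial features and compact size."
)

matches_template(llm_customized, example_df)
```

A check like this catches a model that drops or renames a column, but it says nothing about whether the values are sensible. In the output above, "spotted" is not a breed.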

additional enhancements:

i/o helpers

kuzco now has view_image & view_llm_results functions within the package, making it easy to view images and display llm results. In addition, kuzco now features kuzco_app, a fully functioning shiny application within the package, making it even easier to do computer vision with LLMs in R.

cloud-based LLMs

kuzco now supports all LLM providers that are supported by ellmer! That’s correct: you can now send images to Perplexity, Claude, OpenAI, Gemini, and more. The provider defaults to “ollama” to maintain the original workflows.

Cloud-hosted LLMs generally offer greater speed and more advanced capabilities, but require users to obtain an API key since inference is handled remotely. While some providers offer a free tier with usage limits, others do not. Keep in mind that using a cloud-hosted LLM comes with less privacy compared to running a model locally, but it enables access to powerful, cutting-edge models. To get started, users should set up their API key in their environment and select a provider-hosted model that supports image processing.

Below is a mistral example using pixtral-12b, which is still a pretty small model but leverages mistral’s compute instead of yours.

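# set the key for the current session via base R:
Sys.setenv(MISTRAL_API_KEY = "the_api_key_via_the_provider")
# or open .Renviron with usethis and add the key there (restart R afterwards):
usethis::edit_r_environ()

# classify the image with a provider-hosted model instead of a local one:
kuzco::llm_image_classification(provider = "mistral", llm_model = "pixtral-12b", image = test_img)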
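
Before calling a cloud provider, it can help to confirm that the key is actually visible to the R session. This is a base R sketch. `has_api_key` is a hypothetical helper, not part of kuzco or ellmer, and `MISTRAL_API_KEY` is the variable name used in the example above:

```r
# Hypothetical helper (not part of kuzco or ellmer): TRUE if the named
# environment variable is set to a non-empty value.
has_api_key <- function(var = "MISTRAL_API_KEY") {
  nzchar(Sys.getenv(var, unset = ""))
}

if (has_api_key()) {
  message("MISTRAL_API_KEY found in this session.")
} else {
  message("MISTRAL_API_KEY is not set; add it to .Renviron and restart R.")
}
```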

Install

install.packages('kuzco')

Version

0.1.0

License

MIT + file LICENSE

Maintainer

Frank Hull

Last Published

January 26th, 2026

Functions in kuzco (0.1.0)

kuzco-package
kuzco: Computer Vision with Large Language Models

list_prompts
list prompts

kuzco_app
shiny kuzco app

llm_image_classification
Image Classification using LLMs

view_llm_results
view llm results as a tidy great table

view_image
View Images quickly and easily

llm_image_sentiment
Image Sentiment using LLMs

llm_image_extract_text
Image OCR for Text Extraction using LLMs

edit_prompt
edit prompt

llm_image_custom
Customized Vision using LLMs

llm_image_alt_text
Image Alt Text using LLMs

chat_ellmer
chat ellmer helper (predates ellmer::chat)

llm_image_recognition
Image Recognition using LLMs