localLLM

localLLM provides an easy-to-use interface for running large language models (LLMs) directly in R. It uses the high-performance llama.cpp library as its backend and lets you generate text and analyze data with LLMs. Everything runs locally on your own machine, completely free, and results are reproducible by default. Our goal is to develop it into a reliable toolkit for scientific research.

Tutorial: https://www.eddieyang.net/software/localLLM


Installation

Getting started requires two simple steps: installing the R package from CRAN and then downloading the backend C++ library that handles the heavy computations. The install_localLLM() function automatically detects your operating system (Windows, macOS, Linux) to download the appropriate pre-compiled library.

# 1. Install the R package from CRAN
install.packages("localLLM")

# 2. Load the package and install the backend library
library(localLLM)
install_localLLM()
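
If you are unsure whether the backend is already present, lib_is_installed() (documented below) can serve as a guard. The zero-argument call shown here is a sketch based on the function's description; see ?lib_is_installed for the exact interface.

# Optional: only download the backend library if it is missing
if (!lib_is_installed()) {
  install_localLLM()
}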

Quick Start

You can start running an LLM using quick_llama().

library(localLLM)

# Ask a question and get a response
response <- quick_llama('Classify whether the sentiment of the tweet is Positive
  or Negative.\n\nTweet: "This paper is amazing! I really like it."')

cat(response) # Output: The sentiment of this tweet is Positive.
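
Because quick_llama() takes a single prompt and, as the output above suggests, returns the generated text as a character string, a small batch of documents can be annotated with an ordinary apply-style loop. The prompt wording below is illustrative, and the single-string return value is an assumption based on the Quick Start example.

# Classify several tweets by calling quick_llama() once per prompt
tweets <- c(
  "This paper is amazing! I really like it.",
  "The results are confusing and the writing is hard to follow."
)
prompts <- paste0(
  "Classify whether the sentiment of the tweet is Positive or Negative.\n\nTweet: \"",
  tweets, "\""
)
labels <- vapply(prompts, quick_llama, character(1), USE.NAMES = FALSE)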

Reproducibility

By default, all generation functions in localLLM (quick_llama(), generate(), and generate_parallel()) use deterministic greedy decoding. Even when temperature > 0, results are reproducible, as the example below demonstrates with a fixed seed.

response1 <- quick_llama('Classify whether the sentiment of the tweet is Positive
  or Negative.\n\nTweet: "This paper is amazing! I really like it."', 
  temperature=0.9, seed=92092)

response2 <- quick_llama('Classify whether the sentiment of the tweet is Positive
  or Negative.\n\nTweet: "This paper is amazing! I really like it."', 
  temperature=0.9, seed=92092)

print(response1 == response2)

Report bugs

Please report bugs to xu2009@purdue.edu with sample code and a data file that reproduce the issue. Much appreciated!


Version: 1.1.0
License: MIT + file LICENSE
Maintainer: Yaosheng Xu
Last Published: December 17th, 2025

Functions in localLLM (1.1.0)

explore: Compare multiple LLMs over a shared set of prompts
install_localLLM: Install localLLM Backend Library
document_start: Start automatic run documentation
.with_hf_token: Temporarily apply an HF token for a scoped operation
localLLM-package: R Interface to llama.cpp with Runtime Library Loading
list_cached_models: List cached models on disk
list_ollama_models: List GGUF models managed by Ollama
model_load: Load Language Model with Automatic Download Support
generate_parallel: Generate Text in Parallel for Multiple Prompts
generate: Generate Text Using Language Model Context
set_hf_token: Configure Hugging Face access token
smart_chat_template: Smart Chat Template Application
tokenize: Convert Text to Token IDs
tokenize_test: Test tokenize function (debugging)
intercoder_reliability: Intercoder reliability for LLM annotations
get_lib_path: Get Backend Library Path
validate: Validate model predictions against gold labels and peer agreement
get_model_cache_dir: Get the model cache directory
lib_is_installed: Check if Backend Library is Installed
quick_llama_reset: Reset quick_llama state
quick_llama: Quick LLaMA Inference
apply_chat_template: Apply Chat Template to Format Conversations
apply_gemma_chat_template: Apply Gemma-Compatible Chat Template
backend_free: Free localLLM backend
document_end: Finish automatic run documentation
detokenize: Convert Token IDs Back to Text
annotation_sink_csv: Create a CSV sink for streaming annotation chunks
backend_init: Initialize localLLM backend
context_create: Create Inference Context for Text Generation
ag_news_sample: AG News classification sample
compute_confusion_matrices: Compute confusion matrices from multi-model annotations
hardware_profile: Inspect detected hardware resources
download_model: Download a model manually
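
For finer control than quick_llama(), the lower-level functions above can be combined into an explicit pipeline: load a model, create an inference context, then generate. The sketch below only illustrates that flow; the argument names and the model path are assumptions, so consult the individual help pages for the actual signatures.

library(localLLM)

# Hypothetical sketch of the lower-level API; argument names and the
# model path are assumptions, not the documented signatures.
model <- model_load("path/to/model.gguf")   # GGUF model file on disk
ctx <- context_create(model)                # inference context for generation
out <- generate(ctx, "Classify the sentiment of: 'This paper is amazing!'")
cat(out)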