edge_benchmark

Test inference speed and throughput with the current model to measure
the effectiveness of optimizations.
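As a rough illustration of what "speed and throughput" can mean here, the base R sketch below times a generation call over several iterations and reports the average tokens per second, then compares two configurations. It is not the package's implementation: `time_generation()`, `fake_generate_slow()` and `fake_generate_fast()` are hypothetical helpers, and the stand-in generators only sleep so the sketch runs without a model.

```r
# Hypothetical helper (not part of edgemodelr): time `generate(n_predict)`
# over several iterations and report the average throughput.
time_generation <- function(generate, n_predict = 32, iterations = 3) {
  stopifnot(is.function(generate), n_predict > 0, iterations > 0)
  elapsed <- numeric(iterations)
  for (i in seq_len(iterations)) {
    start <- proc.time()[["elapsed"]]
    generate(n_predict)
    elapsed[i] <- proc.time()[["elapsed"]] - start
  }
  mean_seconds <- mean(elapsed)
  list(
    iterations = iterations,
    mean_seconds = mean_seconds,
    tokens_per_second = n_predict / mean_seconds
  )
}

# Stand-in generators so the sketch runs without a model:
# they only sleep for a fixed time per "token".
fake_generate_slow <- function(n_predict) Sys.sleep(0.010 * n_predict)
fake_generate_fast <- function(n_predict) Sys.sleep(0.005 * n_predict)

baseline  <- time_generation(fake_generate_slow, n_predict = 20, iterations = 3)
optimized <- time_generation(fake_generate_fast, n_predict = 20, iterations = 3)

# Relative speedup of the "optimized" configuration over the baseline.
speedup <- optimized$tokens_per_second / baseline$tokens_per_second
cat(sprintf("baseline:  %.1f tokens/s\n", baseline$tokens_per_second))
cat(sprintf("optimized: %.1f tokens/s\n", optimized$tokens_per_second))
cat(sprintf("speedup:   %.2fx\n", speedup))
```

Averaging over several iterations smooths out run-to-run noise, which is presumably why the function exposes an `iterations` argument.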

Enables R users to run large language models locally using 'GGUF' model files
and the 'llama.cpp' inference engine. Provides a complete R interface for loading models,
generating text completions, and streaming responses in real time. Supports local
inference without requiring cloud APIs or internet connectivity, ensuring complete
data privacy and control. Based on the 'llama.cpp' project by Georgi Gerganov (2023) <https://github.com/ggml-org/llama.cpp>.

Pawan Rama Mali

edgemodelr

Local Large Language Model Inference Engine

Georgi Gerganov

The ggml authors 

Jeffrey Quesnelle

Bowen Peng

pi6am 

Ivan Yurchenko

Dirk Eddelbuettel

edge_benchmark function

Arguments

Performance benchmarking for model inference — edge_benchmark

<dl>

<dt>ctx</dt>
<dd>Model context from edge_load_model()</dd>


<dt>prompt</dt>
<dd>Test prompt to use for benchmarking (default: standard test)</dd>


<dt>n_predict</dt>
<dd>Number of tokens to generate for the test</dd>


<dt>iterations</dt>
<dd>Number of test iterations to average results</dd>

</dl>
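A minimal sketch of a call using the arguments listed above. The model path is hypothetical, the argument values are arbitrary, and `edge_free_model()` is assumed to be the package's cleanup function. Because this page does not describe the return value, the sketch only inspects it with `str()`. The call is guarded so the script does nothing when the package or the model file is missing.

```r
# Sketch only: the path below is hypothetical; point it at a GGUF file you have.
model_path <- "models/example.gguf"
benchmark_ran <- FALSE

if (requireNamespace("edgemodelr", quietly = TRUE) && file.exists(model_path)) {
  ctx <- edgemodelr::edge_load_model(model_path)

  # Argument names follow the list above; the values are arbitrary choices.
  result <- edgemodelr::edge_benchmark(
    ctx,
    prompt = "Explain what a benchmark is in one sentence.",
    n_predict = 32,
    iterations = 3
  )
  str(result)  # inspect whatever structure the function returns
  benchmark_ran <- TRUE

  # Assumed cleanup call; release the model once benchmarking is done.
  edgemodelr::edge_free_model(ctx)
} else {
  message("edgemodelr or the model file is not available; skipping benchmark.")
}
```

Running the same call before and after changing a setting, with the same `prompt` and `n_predict`, keeps the comparison like for like.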
