benchmark_agent

Run a benchmark suite against an agent and collect performance metrics.

A production-grade AI toolkit for R featuring a layered
architecture (Specification, Utilities, Providers, Core), request
interception support, robust error handling with exponential retry
delays, support for multiple AI model providers ('OpenAI',
'Anthropic', etc.), local small language model inference,
a distributed 'MCP' ecosystem, multi-agent orchestration, progressive
knowledge loading through skills, and a global skill store for
sharing AI capabilities.

Yonghe Xia

aisdk

Unified Interface for AI Model Providers

Arguments

<dl><dt>agent</dt>
<dd>An Agent object or model string.</dd>
<dt>tasks</dt>
<dd>A list of benchmark tasks (see details).</dd>
<dt>tools</dt>
<dd>Optional list of tools for the agent.</dd>
<dt>verbose</dt>
<dd>Print progress.</dd></dl>
