pomdp (version 1.1.3)

regret: Calculate the Regret of a Policy

Description

Calculates the regret of a policy relative to a benchmark policy.

Usage

regret(policy, benchmark, belief = NULL)

Value

The regret, i.e., the difference between the expected long-term reward of the benchmark policy and that of the given policy.

Arguments

policy

a solved POMDP containing the policy to calculate the regret for.

benchmark

a solved POMDP with the (optimal) policy. Regret is calculated relative to this policy.

belief

the start belief used for the calculation. If NULL, then the start belief of the benchmark POMDP is used.

Author

Michael Hahsler

Details

Calculates the regret, defined as \(J^{\pi^*}(b_0) - J^{\pi}(b_0)\), where \(J^\pi\) is the expected long-term reward of policy \(\pi\) given the start belief \(b_0\). Note that the optimal policy \(\pi^*\) is usually used as the benchmark for regret. Since the optimal policy may not be known, the regret relative to the best known policy can be used instead.
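
A minimal sketch of the quantity being computed (not the function's internal implementation), assuming the package's reward() accessor returns \(J^\pi(b_0)\) when called on a solved model at its start belief:

library("pomdp")
data("Tiger")

pi_star <- solve_POMDP(Tiger)                # benchmark policy, treated as optimal
pi_hat  <- solve_POMDP(Tiger, horizon = 10)  # weaker finite-horizon policy

# J^{pi*}(b_0) - J^{pi}(b_0), both evaluated at the model's start belief
# (assumption: reward() with belief = NULL uses the start belief and returns
#  the expected long-term reward of the solved policy)
reward(pi_star) - reward(pi_hat)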

See Also

Other POMDP: POMDP_accessors, POMDP(), add_policy(), plot_belief_space(), projection(), sample_belief_space(), simulate_POMDP(), solve_POMDP(), solve_SARSOP(), transition_graph(), update_belief(), value_function(), write_POMDP()

Examples

library("pomdp")
data(Tiger)

# solve for the (near-)optimal infinite-horizon policy used as the benchmark
sol_optimal <- solve_POMDP(Tiger)
sol_optimal

# perform exact value iteration for 10 epochs
sol_quick <- solve_POMDP(Tiger, method = "enum", horizon = 10)
sol_quick

# regret of the quick 10-epoch policy relative to the benchmark
regret(sol_quick, sol_optimal)
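
The belief argument can be used to evaluate the regret at a different start belief. As an illustrative sketch (assuming a length-two belief vector over the Tiger problem's two states, tiger-left and tiger-right):

# regret evaluated at a uniform start belief over the two tiger states
regret(sol_quick, sol_optimal, belief = c(0.5, 0.5))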
