runPolicyIteAve

The policy can afterwards be retrieved using the functions <code>getPolicy</code> and <code>getPolicyW</code>.

Create and optimize (semi-)MDPs with discrete time steps and
state space. Both hierarchical and ordinary (traditional) MDPs can be
modeled.

Lars Relund

MDP2

Markov Decision Processes (MDPs)

Lars Relund Nielsen


Arguments

Perform policy iteration (average reward criterion) on the MDP.

<dl>

<dt>mdp</dt>
<dd>The MDP loaded using <code>loadMDP()</code>.</dd>


<dt>w</dt>
<dd>The label of the weight we optimize.</dd>


<dt>dur</dt>
<dd>The label of the duration/time such that discount rates can be calculated.</dd>


<dt>maxIte</dt>
<dd>Maximum number of iterations. If the model does not satisfy the unichain assumption, the algorithm may loop without converging; this limit then stops it.</dd>


<dt>getLog</dt>
<dd>Whether to output the log messages.</dd>

</dl>
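To make the roles of the weight, the duration, and the iteration limit more concrete, the following is a minimal base R sketch of policy iteration under the average reward criterion for a tiny unichain semi-MDP. It is not MDP2's implementation and does not call the package: the function name <code>policyIteAve</code>, the array layout of <code>P</code>, <code>r</code> and <code>d</code>, and the two-state example data are all assumptions made for illustration.

```r
# Illustrative sketch only (hypothetical helper, not part of MDP2).
# P[s, j, a]: probability of moving from state s to j under action a
# r[s, a]:    reward (the weight being optimized)
# d[s, a]:    duration of action a in state s
policyIteAve <- function(P, r, d, maxIte = 100, getLog = FALSE) {
  n <- nrow(r)
  policy <- rep(1L, n)                     # start with action 1 in every state
  for (ite in seq_len(maxIte)) {
    # Policy evaluation: solve h(s) + g * d(s) - sum_j P(s, j) h(j) = r(s)
    # with h(1) fixed to 0, so the unknowns are (g, h(2), ..., h(n)).
    idx <- cbind(seq_len(n), policy)
    Ppol <- matrix(0, n, n)
    for (s in seq_len(n)) Ppol[s, ] <- P[s, , policy[s]]
    A <- diag(n) - Ppol
    A[, 1] <- d[idx]                       # column of h(1) is reused for g
    x <- solve(A, r[idx])
    g <- x[1]
    h <- c(0, x[-1])

    # Policy improvement: keep the current action unless another is strictly better.
    newPolicy <- policy
    for (s in seq_len(n)) {
      q <- r[s, ] - g * d[s, ] + as.vector(h %*% matrix(P[s, , ], nrow = n))
      best <- which.max(q)
      if (q[best] > q[policy[s]] + 1e-10) newPolicy[s] <- best
    }
    if (getLog) message(sprintf("Iteration %d: g = %.4f", ite, g))
    if (all(newPolicy == policy)) {
      return(list(policy = policy, g = g, h = h, iterations = ite))
    }
    policy <- newPolicy
  }
  warning("maxIte reached before the policy became stable")
  list(policy = policy, g = g, h = h, iterations = maxIte)
}

# Assumed example data: two states, two actions per state.
P <- array(0, dim = c(2, 2, 2))
P[1, , 1] <- c(0.5, 0.5)
P[1, , 2] <- c(0, 1)
P[2, , 1] <- c(0.5, 0.5)
P[2, , 2] <- c(1, 0)
r <- matrix(c(5, 10, -1, -3), nrow = 2, byrow = TRUE)
d <- matrix(c(1, 1.5, 1, 1), nrow = 2, byrow = TRUE)

res <- policyIteAve(P, r, d, maxIte = 20)
res$policy   # action 2 in both states
res$g        # average reward per time unit, 2.8
```

In this sketch the iteration limit plays the same protective role as <code>maxIte</code>: if the evaluation step never yields a stable policy, the loop ends with a warning instead of running forever.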

