runPolicyIteDiscount

The policy can afterwards be retrieved using the functions <code>getPolicy</code> and <code>getPolicyW</code>.

Create and optimize MDPs and semi-MDPs with discrete time steps and a
discrete state space. Both hierarchical and ordinary (traditional) MDPs
can be modeled.

Lars Relund

MDP2

Markov Decision Processes (MDPs)

Lars Relund Nielsen


<dl><dt>mdp</dt>
<dd>The MDP loaded using <code>loadMDP()</code>.</dd>
<dt>w</dt>
<dd>The label of the weight to optimize.</dd>
<dt>dur</dt>
<dd>The label of the duration/time such that discount rates can be calculated.</dd>
<dt>rate</dt>
<dd>The interest rate.</dd>
<dt>rateBase</dt>
<dd>The time horizon over which the rate is valid.</dd>
<dt>discountFactor</dt>
<dd>The discount factor for one time unit. If specified, <code>rate</code> and <code>rateBase</code> are not used to calculate the discount factor.</dd>
<dt>maxIte</dt>
<dd>Maximum number of iterations. If the model does not satisfy the unichain assumption, the algorithm may loop.</dd>
<dt>discountMethod</dt>
<dd>Either 'continuous' or 'discrete', corresponding to discount factor <code>exp(-rate/rateBase)</code> or <code>1/(1 + rate/rateBase)</code>, respectively. Only used if <code>discountFactor</code> is <code>NULL</code>.</dd>
<dt>getLog</dt>
<dd>Output the log messages.</dd></dl>
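
The two formulas quoted for <code>discountMethod</code> can be checked by hand. The sketch below is base R only and does not call MDP2; the helper <code>discount_factor</code> and the numeric values are hypothetical and chosen purely for illustration, and the default method shown here is an assumption of the sketch, not a statement about the package.

```r
# Hypothetical helper (not part of MDP2): evaluates the one-period discount
# factor using the two formulas given for the discountMethod argument.
discount_factor <- function(rate, rateBase,
                            discountMethod = c("continuous", "discrete")) {
  discountMethod <- match.arg(discountMethod)
  if (discountMethod == "continuous") {
    exp(-rate / rateBase)        # continuous compounding
  } else {
    1 / (1 + rate / rateBase)    # discrete compounding
  }
}

# Illustrative values only: a rate of 0.1 valid over 365 time units.
cont <- discount_factor(rate = 0.1, rateBase = 365, discountMethod = "continuous")
disc <- discount_factor(rate = 0.1, rateBase = 365, discountMethod = "discrete")
```

For a positive rate both factors lie strictly between 0 and 1, and the continuous factor is slightly smaller than the discrete one. A value computed this way could presumably be passed as <code>discountFactor</code>, in which case <code>rate</code> and <code>rateBase</code> are not used.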

Arguments

Perform policy iteration (discounted reward criterion) on the MDP.

