MDP2 (version 2.1.2)

runPolicyIteAve: Perform policy iteration (average reward criterion) on the MDP.

Description

After the run, the policy can be retrieved using the functions getPolicy and getPolicyW.

Usage

runPolicyIteAve(mdp, w, dur, maxIte = 100, getLog = TRUE)

Value

The optimal gain (g) found, i.e. the average reward per time unit.

Arguments

mdp

The MDP loaded using loadMDP().

w

The label of the weight to optimize.

dur

The label of the duration/time such that discount rates can be calculated.

maxIte

Maximum number of iterations. If the model does not satisfy the unichain assumption, the algorithm may loop; this limit stops it after maxIte iterations.

getLog

If TRUE, output the log messages.
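
To make the roles of the gain g and of maxIte concrete, here is a minimal base R sketch of policy iteration under the average reward criterion. It is not the MDP2 implementation: it assumes a small illustrative model with two states, two actions and unit durations (so no dur weight is needed), and the names P, R and policyIterationAve are hypothetical.

```r
# Illustrative model (assumption, not from MDP2): 2 states, 2 actions.
# P[[a]][s, ] holds transition probabilities, R[s, a] the expected reward.
P <- list(
  matrix(c(0.5, 0.5,
           0.4, 0.6), nrow = 2, byrow = TRUE),
  matrix(c(0.8, 0.2,
           0.7, 0.3), nrow = 2, byrow = TRUE)
)
R <- cbind(c(6, -3), c(4, -5))

# Hypothetical helper: policy iteration with unit durations.
policyIterationAve <- function(P, R, maxIte = 100) {
  n <- nrow(R)
  idx <- seq_len(n)
  policy <- rep(1L, n)                      # start with action 1 everywhere
  for (ite in seq_len(maxIte)) {
    # Policy evaluation: solve g + h(s) = r(s) + sum_j p(s, j) h(j), h(n) = 0.
    Pd <- t(sapply(idx, function(s) P[[policy[s]]][s, ]))
    rd <- R[cbind(idx, policy)]
    A <- cbind(1, (diag(n) - Pd)[, -n, drop = FALSE])
    sol <- solve(A, rd)
    g <- sol[1]
    h <- c(sol[-1], 0)
    # Policy improvement: switch only when an action is strictly better.
    Q <- sapply(seq_along(P), function(a) R[, a] + as.vector(P[[a]] %*% h))
    best <- max.col(Q, ties.method = "first")
    improved <- Q[cbind(idx, best)] > Q[cbind(idx, policy)] + 1e-9
    if (!any(improved)) {
      return(list(gain = g, policy = policy, iterations = ite))
    }
    policy[improved] <- best[improved]
  }
  stop("no stable policy within maxIte iterations")
}

res <- policyIterationAve(P, R)
res$gain      # average reward per step of the final policy
res$policy    # chosen action index for each state
```

For this model the sketch settles on action 2 in both states with a gain of 2. The iteration cap plays the same role as maxIte above: it bounds the work if no stable policy is reached.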

See Also

getPolicy().
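
Examples

A typical call sequence might look like the sketch below. The wrapper solveAverage is a hypothetical helper, not part of MDP2, and the file prefix and weight labels in the commented usage are placeholders that must match your own model files.

```r
# Hypothetical helper (not part of MDP2): load a model, run policy iteration
# under the average reward criterion, and return the gain with the policy.
solveAverage <- function(prefix, w, dur, maxIte = 100) {
  mdp <- MDP2::loadMDP(prefix)   # the model files must already exist
  g <- MDP2::runPolicyIteAve(mdp, w, dur, maxIte = maxIte, getLog = FALSE)
  list(gain = g, policy = MDP2::getPolicy(mdp))
}

# Assumed usage; "machine_", "Net reward" and "Duration" are placeholders.
# res <- solveAverage("machine_", w = "Net reward", dur = "Duration")
# res$gain
# res$policy
```

The policy is read only after runPolicyIteAve has returned, since the run is what stores it in the model.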