The policy can afterwards be retrieved using the functions getPolicy and getPolicyW.
runPolicyIteDiscount(
mdp,
w,
dur,
rate = 0,
rateBase = 1,
discountFactor = NULL,
maxIte = 100,
discountMethod = "continuous",
getLog = TRUE
)
Nothing.
mdp: The MDP loaded using loadMDP().
w: The label of the weight we optimize.
dur: The label of the duration/time such that discount rates can be calculated.
rate: The interest rate.
rateBase: The time horizon the rate is valid over.
discountFactor: The discount factor for one time unit. If specified, rate and rateBase are not used to calculate it.
maxIte: The maximum number of iterations. If the model does not satisfy the unichain assumption, the algorithm may loop.
discountMethod: Either 'continuous' or 'discrete', corresponding to discount factor exp(-rate/rateBase) or 1/(1 + rate/rateBase), respectively. Only used if discountFactor is NULL.
getLog: Output the log messages.
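The two discounting formulas above can be illustrated with a small base R sketch. The helper discount_factor() is hypothetical and not part of the package; it only evaluates the formulas as stated. The last step, raising the per-unit factor to the stage duration, is an assumption about how a duration enters the discount, not something this page confirms.

```r
# Hypothetical helper (not part of the package): per-time-unit discount factor
# computed from the two formulas given in the argument description.
discount_factor <- function(rate, rateBase = 1,
                            discountMethod = c("continuous", "discrete")) {
  discountMethod <- match.arg(discountMethod)
  if (discountMethod == "continuous") {
    exp(-rate / rateBase)
  } else {
    1 / (1 + rate / rateBase)
  }
}

# A 3% yearly interest rate with time measured in days (rateBase = 365).
f_cont <- discount_factor(0.03, rateBase = 365, discountMethod = "continuous")
f_disc <- discount_factor(0.03, rateBase = 365, discountMethod = "discrete")

# Assumption: a stage lasting d time units is discounted by the per-unit
# factor raised to the power d.
d <- 30
stage_discount <- f_cont^d
```

For small values of rate/rateBase the two methods give nearly the same factor; the discrete factor is always slightly larger for a positive rate.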
getPolicy().
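To give a feel for what policy iteration under a discount criterion does, here is a minimal, self-contained sketch in base R. It does not use the package or reproduce its implementation: the two-state model, the function policy_iteration() and all variable names are made up for illustration, and every transition is assumed to last one time unit.

```r
# Toy model (made up for illustration): 2 states, 2 actions per state.
# P[[a]][s, s2] is the transition probability; rew[s, a] is the immediate reward.
P <- list(
  matrix(c(0.9, 0.1,
           0.4, 0.6), nrow = 2, byrow = TRUE),  # action 1
  matrix(c(0.2, 0.8,
           0.1, 0.9), nrow = 2, byrow = TRUE)   # action 2
)
rew <- matrix(c(1, 0,
                0, 2), nrow = 2, byrow = TRUE)  # rows: states, columns: actions
gamma <- exp(-0.1)  # continuous discounting with rate = 0.1 and rateBase = 1

policy_iteration <- function(P, rew, gamma, maxIte = 100) {
  n <- nrow(rew)
  policy <- rep(1L, n)  # start with action 1 in every state
  for (ite in seq_len(maxIte)) {
    # Policy evaluation: solve (I - gamma * P_pi) v = r_pi.
    P_pi <- t(sapply(seq_len(n), function(s) P[[policy[s]]][s, ]))
    r_pi <- rew[cbind(seq_len(n), policy)]
    v <- solve(diag(n) - gamma * P_pi, r_pi)
    # Policy improvement: choose the action with the best one-step lookahead.
    q <- sapply(seq_along(P),
                function(a) rew[, a] + gamma * as.vector(P[[a]] %*% v))
    new_policy <- max.col(q, ties.method = "first")
    if (all(new_policy == policy)) break  # policy is stable, so stop
    policy <- new_policy
  }
  list(policy = policy, value = v, iterations = ite)
}

res <- policy_iteration(P, rew, gamma)
res$policy  # chosen action index for each state
res$value   # expected discounted reward from each state
```

The loop stops as soon as the improvement step leaves the policy unchanged; the maxIte cap plays the same safeguarding role as the argument of the same name described above.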