
Markov Decision Processes (MDPs) in R

The MDP2 package provides functions for solving Markov decision processes (MDPs) with discrete time steps, states, and actions. Traditional MDPs (Puterman 1994), semi-Markov decision processes (semi-MDPs) (Tijms 2003), and hierarchical MDPs (HMDPs) (Kristensen and Jørgensen 2000) can all be solved under a finite or infinite time-horizon.

Building and solving an MDP is done in two steps. First, the MDP is built and saved to a set of binary files. Next, the MDP is loaded into memory from the binary files and various algorithms are applied to the model.
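As a taste of step one, the sketch below builds a tiny infinite-horizon model (a machine that is either ok or broken) and saves it to binary files. This is a minimal sketch assuming the writer API described in the package vignette: binaryMDPWriter() is assumed to return a set of sub-functions, and the pr argument of the action sub-function is assumed to take (scope, state index, probability) triples, with scope 1 denoting a transition to the next (here: repeated) stage. Check ?binaryMDPWriter for the exact signature.

library(MDP2)

# Build a small infinite-horizon model and save it to binary files
# with prefix "machine_" (prefix name is arbitrary).
w <- binaryMDPWriter("machine_")
w$setWeights(c("Duration", "Net reward"))  # weights attached to each action
w$process()
  w$stage()  # a single stage that is repeated (infinite time-horizon)
    w$state(label = "ok")          # state index 0
      # pr = (scope, state index, probability) triples (assumed format)
      w$action(label = "maintain", weights = c(1, -10),
               pr = c(1, 0, 0.9, 1, 1, 0.1), end = TRUE)
      w$action(label = "run", weights = c(1, 20),
               pr = c(1, 0, 0.5, 1, 1, 0.5), end = TRUE)
    w$endState()
    w$state(label = "broken")      # state index 1
      w$action(label = "repair", weights = c(2, -50),
               pr = c(1, 0, 1), end = TRUE)
    w$endState()
  w$endStage()
w$endProcess()
w$closeWriter()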

The package implements well-known algorithms such as policy iteration and value iteration under different criteria, e.g. average reward per time unit and expected total discounted reward. The model is stored using an underlying data structure based on the state-expanded directed hypergraph of the MDP (Nielsen and Kristensen 2006), implemented in C++ for fast running times.
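Continuing the sketch above, step two loads the binary files and solves the model. The function names below appear in this package, but the argument names (wLbl, durLbl, rate) are assumptions based on the vignette; consult the individual help pages before relying on them.

# Load the model built above into memory (the underlying C++ data structure)
mdp <- loadMDP("machine_")

# Expected total discounted reward: policy iteration with an (assumed)
# interest rate of 0.1 per unit of the "Duration" weight
runPolicyIteDiscount(mdp, wLbl = "Net reward", durLbl = "Duration", rate = 0.1)
getPolicy(mdp)  # the optimal policy and its value in each state

# Average reward per time unit
runPolicyIteAve(mdp, wLbl = "Net reward", durLbl = "Duration")
getPolicy(mdp)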

Installation

Install the latest stable release from CRAN:

install.packages("MDP2")

Alternatively, install the latest development version from GitHub (recommended):

remotes::install_github("relund/mdp")

We load the package using

library(MDP2)

Help for the package can be displayed by writing

?MDP2

To illustrate the package's capabilities, we use a few examples: an infinite-horizon semi-MDP, a finite-horizon semi-MDP, and an HMDP. A short introduction to each model class is given before the corresponding example.

Learning more

To get started, first read vignette("MDP2").

For more examples see example("MDP2").

References

Kristensen, A. R., and E. Jørgensen. 2000. “Multi-Level Hierarchic Markov Processes as a Framework for Herd Management Support.” Annals of Operations Research 94: 69–89. https://doi.org/10.1023/A:1018921201113.

Nielsen, L. R., and A. R. Kristensen. 2006. “Finding the K Best Policies in a Finite-Horizon Markov Decision Process.” European Journal of Operational Research 175 (2): 1164–79. https://doi.org/10.1016/j.ejor.2005.06.011.

Puterman, M. L. 1994. Markov Decision Processes. Wiley Series in Probability and Mathematical Statistics. Wiley-Interscience.

Tijms, H. C. 2003. A First Course in Stochastic Models. John Wiley & Sons Ltd.


Version: 2.1.2

License: GPL (>= 3.3.2)


Maintainer: Lars Relund

Last Published: January 31st, 2023

Functions in MDP2 (2.1.2)

MDP2-package - MDP2: Markov Decision Processes (MDPs).
actionInfo - Info about the actions in the HMDP model under consideration.
actionWeightMat - Info about the weights of the actions in the HMDP model under consideration.
getPolicy - Get parts of the optimal policy.
getInfo - Information about the MDP.
binaryActionWriter - Function for writing actions of an HMDP model to binary files. The function defines sub-functions which can be used to define actions saved in a set of binary files. It is assumed that the states have been defined using binaryMDPWriter and that the ids of the states are known (they can be retrieved using e.g. stateIdxDf).
actionIdxDf - Info about the actions in the HMDP model under consideration.
runPolicyIteDiscount - Perform policy iteration (discounted reward criterion) on the MDP.
getWIdx - Return the index of a weight in the model. Note that indexes always start from zero (C++ style), i.e. the first weight, the first state at a stage, etc. have index 0.
hmpMDPWriter - Function for writing an HMDP model to a hmp (XML) file. The function defines sub-functions which can be used to define an HMDP model stored in a hmp file.
runValueIte - Perform value iteration on the MDP.
checkWDurIdx - Internal function. Check if the indexes given are okay. Should not be used unless you know what you are doing.
binaryMDPWriter - Function for writing an HMDP model to binary files. The function defines sub-functions which can be used to define an HMDP model saved in a set of binary files.
plotHypergraph - Plot parts of the state-expanded hypergraph (experimental).
runPolicyIteAve - Perform policy iteration (average reward criterion) on the MDP.
runCalcWeights - Calculate weights based on the current policy. Normally run after an optimal policy has been found.
randomHMDP - Generate a "random" HMDP stored in a set of binary files.
convertBinary2HMP - Convert an HMDP model stored in binary format to a hmp (XML) file. The function simply parses the binary files and creates hmp files using hmpMDPWriter().
checkWIdx - Internal function. Check if the index of the weight is okay. Should not be used unless you know what you are doing.
saveMDP - Save the MDP to binary files.
getHypergraph - Return (parts of) the state-expanded hypergraph.
getBinInfoStates - Info about the states in the binary files of the HMDP model under consideration.
convertHMP2Binary - Convert an HMDP model stored in a hmp (XML) file to binary file format.
setPolicy - Modify the current policy by setting the policy action of states.
getBinInfoActions - Info about the actions in the HMDP model under consideration.
transProbMat - Info about the transition probabilities in the HMDP model under consideration.
stateIdxDf - Info about the states in the HMDP model under consideration.
getRPO - Calculate the retention pay-off (RPO) or opportunity cost for some states.
stateIdxMat - Info about the states in the HMDP model under consideration.
getSteadyStatePr - Calculate the steady-state transition probabilities for the founder process (level 0).
loadMDP - Load the HMDP model defined in the binary files. The model is created in memory using the external C++ library.
plot.HMDP - Plot the state-expanded hypergraph of the MDP.
weightNames - Names of weights used in actions.
actionIdxMat - Info about the actions in the HMDP model under consideration.