mistat (version 2.0.4)

dynOAB: Dynamic programming of the optimal One-Armed Bandits

Description

Dynamic programming for the optimal one-armed Bernoulli bandit process, in which the probability of reward on arm A is known.

Usage

dynOAB(N, al)
dynOAB2(N, al)

Value

dynOAB returns the matrix of maximal predicted rewards. dynOAB2 returns the optimal predicted reward.

Arguments

N

number of trials.

al

the known probability of reward on arm A.

Author

Shelemyahu Zacks

See Also

simOAB

Examples

dynOAB(10, 0.5)
dynOAB2(10, 0.5)
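
The recursion behind this kind of computation can be illustrated with a minimal base-R sketch. It is not the mistat implementation, and its output is not claimed to match dynOAB or dynOAB2. The function name oab_value_sketch is hypothetical, and the sketch assumes a uniform Beta(1, 1) prior on the unknown arm and that a switch to the known arm A is permanent.

```r
# Hypothetical helper, NOT part of mistat: backward induction for a
# one-armed Bernoulli bandit.
# Assumptions: the unknown arm has a uniform Beta(1, 1) prior, and once
# the player switches to the known arm A, they stay on it.
oab_value_sketch <- function(N, al) {
  stopifnot(N >= 1, al >= 0, al <= 1)
  # V[x + 1] is the expected future reward after k trials on the unknown
  # arm with x successes. At k = N no trials remain, so the value is 0.
  V <- numeric(N + 1)
  for (k in (N - 1):0) {
    x <- 0:k
    remaining <- N - k
    # Posterior mean of the unknown arm's success probability.
    p_hat <- (x + 1) / (k + 2)
    # Play the unknown arm once more, then continue optimally.
    continue <- p_hat * (1 + V[x + 2]) + (1 - p_hat) * V[x + 1]
    # Switch to arm A for all remaining trials.
    switch_to_A <- remaining * al
    V <- pmax(continue, switch_to_A)
  }
  V[1]  # expected total reward from the start (k = 0, x = 0)
}

oab_value_sketch(10, 0.5)
```

Under these assumptions the result is never below N * al, since playing arm A on every trial is always an available strategy.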