Simulate the expected number of trials on arm B before switching to the known arm A, and the expected reward.
simOAB(N, p, al, k, gam, Ns)
mean value at the stopping time
standard deviation of the value at the stopping time
mean value of the expected reward
standard deviation of the expected reward
N: number of trials.
p: the probability of reward on arm B (unknown).
al: the known probability of reward on arm A.
k: the initial sample size on arm B.
gam: Bayesian confidence level.
Ns: number of runs in the simulation.
Author: Shelemyahu Zacks
See also: dynOAB
set.seed(123)
simOAB(N = 50, p = 0.6, al = 0.5, k = 10, gam = 0.95, Ns = 1000)
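To illustrate what the arguments in the call above control, here is a minimal sketch of a one-armed bandit simulation in base R. It is not the package's implementation of simOAB. The function name sim_oab_sketch, the uniform prior on p, the stopping rule (leave arm B at the first trial n >= k where the posterior probability that p < al reaches gam), and the reward definition (successes on B plus al times the remaining trials) are all assumptions made for illustration.

```r
# Hypothetical sketch; NOT the package's simOAB. Prior, stopping rule and
# reward definition are assumptions chosen for illustration only.
sim_oab_sketch <- function(N, p, al, k, gam, Ns) {
  stopifnot(k >= 1, k <= N)
  stop_time <- numeric(Ns)
  reward <- numeric(Ns)
  n <- seq_len(N)
  for (r in seq_len(Ns)) {
    x <- rbinom(N, size = 1, prob = p)  # potential outcomes on arm B
    s <- cumsum(x)                      # successes on B after each trial
    # Posterior P(p < al | data) under an assumed uniform prior:
    # p | data ~ Beta(1 + s, 1 + n - s)
    post <- pbeta(al, 1 + s, 1 + n - s)
    # Assumed rule: switch to A at the first n >= k with post >= gam;
    # if that never happens, stay on B for all N trials
    hit <- which(n >= k & post >= gam)
    m <- if (length(hit) > 0) hit[1] else N
    stop_time[r] <- m
    # Assumed reward: observed successes on B plus expected reward on A
    reward[r] <- s[m] + (N - m) * al
  }
  c(mean_stop = mean(stop_time), sd_stop = sd(stop_time),
    mean_reward = mean(reward), sd_reward = sd(reward))
}

set.seed(123)
res <- sim_oab_sketch(N = 50, p = 0.6, al = 0.5, k = 10, gam = 0.95, Ns = 1000)
print(res)
```

The four returned numbers mirror the mean and standard deviation of the stopping time and of the reward described above. Their values will generally differ from those of simOAB, since the rule used here is only assumed.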