
RemixAutoML (version 0.4.2)

RPM_Binomial_Bandit: Randomized Probability Matching for a Binomial Bandit

Description

RPM_Binomial_Bandit computes, for each arm of a multi-armed bandit with binomial rewards, the randomized probability matching (RPM) probability that the arm is the best one. RPM is a close cousin of Thompson Sampling.
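
To make the quantity concrete, here is a minimal Monte Carlo sketch in base R. It assumes a Beta(Alpha, Beta) prior on each arm's success rate, which is the usual choice for binomial bandits but is not confirmed here as this function's internals. The helper name mc_prob_best, the n_draws argument, and the example counts are hypothetical. The sketch draws from each arm's posterior and counts how often each arm has the largest draw; Thompson Sampling uses a single such draw to choose the next arm.

```r
# Hypothetical helper, not part of RemixAutoML: Monte Carlo estimate of
# the probability that each arm has the highest success rate.
mc_prob_best <- function(Success, Trials, Alpha = 1, Beta = 1, n_draws = 20000L) {
  k <- length(Success)
  # One column of posterior draws per arm, assuming a
  # Beta(Alpha + successes, Beta + failures) posterior
  draws <- vapply(
    seq_len(k),
    function(i) rbeta(n_draws, Alpha + Success[i], Beta + Trials[i] - Success[i]),
    numeric(n_draws)
  )
  # Index of the winning arm in each simulated draw
  winners <- max.col(draws, ties.method = "first")
  # Share of draws won by each arm
  tabulate(winners, nbins = k) / n_draws
}

set.seed(42)
probs_mc <- mc_prob_best(Success = c(20, 30, 45), Trials = c(100, 100, 100))
print(round(probs_mc, 3))
```

The estimates are noisy and depend on the seed and on n_draws; they only approximate the probabilities that a deterministic calculation would give.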

Usage

RPM_Binomial_Bandit(
  Success,
  Trials,
  Alpha = 1L,
  Beta = 1L,
  SubDivisions = 1000L
)

Arguments

Success

Numeric vector of success counts, one element per arm.

Trials

Numeric vector of trial counts, one element per arm, in the same arm order as Success.

Alpha

Prior parameter for successes. Default is 1L.

Beta

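Prior parameter for trials. In the usual Beta-binomial setup this is the second shape parameter of the Beta prior and weights the trials that were not successes. Default is 1L.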

SubDivisions

Maximum number of subintervals used for numerical integration. The default in the stats package is 100L; this function raises it to 1000L.

Value

Numeric vector with one element per arm, giving the probability that the arm is the best arm compared to all other arms.
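
As a sketch of how such probabilities can be obtained by numerical integration (which the SubDivisions argument suggests, although the package's actual implementation is not shown on this page), the probability that arm i is best can be written as the integral over x in [0, 1] of arm i's posterior density at x multiplied by the posterior probability that every other arm's success rate is below x. The helper prob_best_integrate and the example counts below are hypothetical, and Beta posteriors are an assumption.

```r
# Hypothetical base-R sketch, not the RemixAutoML source.
prob_best_integrate <- function(Success, Trials, Alpha = 1, Beta = 1,
                                SubDivisions = 1000L) {
  k <- length(Success)
  a <- Alpha + Success           # assumed posterior shape 1 per arm
  b <- Beta + Trials - Success   # assumed posterior shape 2 per arm
  vapply(seq_len(k), function(i) {
    integrand <- function(x) {
      # Posterior density of arm i at x ...
      out <- dbeta(x, a[i], b[i])
      # ... times P(rate of arm j < x) for every other arm j
      for (j in seq_len(k)[-i]) out <- out * pbeta(x, a[j], b[j])
      out
    }
    integrate(integrand, lower = 0, upper = 1,
              subdivisions = SubDivisions)$value
  }, numeric(1))
}

probs <- prob_best_integrate(Success = c(20, 30, 45), Trials = c(100, 100, 100))
print(round(probs, 4))
```

The entries should sum to approximately 1, up to integration error. A larger SubDivisions gives stats::integrate() more room to refine sharply peaked integrands, which can arise when trial counts are large.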

See Also

Other Reinforcement Learning: RL_Initialize(), RL_ML_Update(), RL_Update()