end_mab_trial: Ends Multi-Arm Bandit Trial

Description

Condenses output from run_mab_trial() into a manageable structure.

Usage

end_mab_trial(data, bandits, algorithm, periods, conditions, ndraws)

Value

A named list containing:

final_data: The processed tibble or data.table, containing new columns pertaining to the results of the trial.
bandits: A tibble or data.table containing the UCB1 values or Thompson sampling posterior distributions for each period.
assignment_probs: A tibble or data.table containing the probability of being assigned each treatment arm at a given period.

Arguments

data: Finalized data from run_mab_trial().
bandits: Finalized bandits list from run_mab_trial().
algorithm: A character string specifying the MAB algorithm to use. Options are "thompson" or "ucb1". Algorithm defines the adaptive assignment process. Mathematical details on these algorithms can be found in Kuleshov and Precup 2014 and Slivkins 2024.
periods: Numeric value of length 1; total number of periods in the Multi-Arm Bandit trial.
ndraws: A numeric value. When direct calculation of the Thompson sampling probabilities fails, draws from a simulated posterior are used to approximate them. This is the number of simulations to use. The default is 5000, matching the default of bandit::best_binomial_bandit_sim(), but it may need to be raised or lowered depending on performance and accuracy concerns.
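
As a rough illustration of what ndraws controls, the sketch below approximates Thompson sampling probabilities by simulation in base R. It is not the package's internal code: the binary outcomes, the uniform Beta(1, 1) prior, and the names successes, trials, draws, and probs are all assumptions made for this example.

```r
set.seed(1)

# Hypothetical results so far for two arms (assumed binary outcomes)
successes <- c(A = 12, B = 20)
trials    <- c(A = 40, B = 40)
ndraws    <- 5000

# One column per arm, one row per simulated draw from each arm's
# Beta(1 + successes, 1 + failures) posterior (uniform prior assumed)
draws <- sapply(names(successes), function(arm) {
  rbeta(ndraws, 1 + successes[[arm]], 1 + trials[[arm]] - successes[[arm]])
})

# For every draw, find which arm had the largest sampled success rate
best <- max.col(draws, ties.method = "first")

# Share of draws in which each arm was best: the approximate
# Thompson sampling assignment probabilities
probs <- tabulate(best, nbins = ncol(draws)) / ndraws
names(probs) <- colnames(draws)
probs
```

Raising ndraws reduces the Monte Carlo noise in probs at the cost of more computation per period, which is the trade-off the argument description refers to.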

Details

Takes the bandit lists provided and condenses them into tibbles or data.tables using dplyr::bind_rows(), then pivots each table to wide format, where each treatment arm is a column and each row represents a period.
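
The condensing step might look roughly like the sketch below. The shape of the per-period list elements (long-format tibbles with period, arm, and value columns) and the name bandit_list are assumptions for illustration; the actual objects returned by run_mab_trial() may be structured differently.

```r
library(dplyr)
library(tidyr)

# Hypothetical per-period bandit output: one long-format tibble per period
bandit_list <- list(
  tibble(period = 1, arm = c("A", "B"), value = c(0.5, 0.5)),
  tibble(period = 2, arm = c("A", "B"), value = c(0.7, 0.3))
)

bandits_wide <- bandit_list %>%
  bind_rows() %>%                                     # stack all periods into one table
  pivot_wider(names_from = arm, values_from = value)  # one column per treatment arm

bandits_wide
```

Each row of bandits_wide corresponds to one period, and each treatment arm has its own column.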

At this step the final UCB1 values or Thompson sampling probabilities are calculated. The entire table is then shifted backward by one period so that each row reflects the calculation made after that period is completed. For example, before the shift, row 11 holds the calculations made for period 11 before assignment; after the shift, it holds the calculations made after period 11's imputations.

This has the effect of removing the original first row, where all the assignment probabilities are equal, and making the last row represent the final calculation after the conclusion of the simulation.

The assignment probabilities are not changed in this way, so for each period they still reflect the assignment probabilities used in that period.
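
The one-period shift of the bandits table can be sketched as follows. This is a minimal illustration under assumed data, not the package's implementation; pre_assignment, final_calc, and shifted are hypothetical names, and the values are made up.

```r
library(dplyr)

# Hypothetical table before the shift: row t holds the values
# calculated before assignment in period t
pre_assignment <- tibble(
  period = 1:3,
  A = c(0.50, 0.60, 0.75),
  B = c(0.50, 0.40, 0.25)
)

# Hypothetical final calculation, made after the last period ends
final_calc <- tibble(A = 0.80, B = 0.20)

# Drop the original first row (equal values), append the final calculation,
# and reattach the period labels so row t now reflects the state after period t
shifted <- bind_rows(pre_assignment[-1, c("A", "B")], final_calc) %>%
  mutate(period = pre_assignment$period, .before = 1)

shifted
```

The table keeps the same number of rows, but each row now describes the state after its period. A separate assignment probability table would be left as-is, since it records what was actually used in each period.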
