multiRL (version 0.2.3)

Reinforcement Learning Tools for Multi-Armed Bandit

Description

A flexible general-purpose toolbox for implementing Rescorla-Wagner models in multi-armed bandit tasks. As the successor and functional extension of the 'binaryRL' package, 'multiRL' modularizes the Markov Decision Process (MDP) into six core components. This framework enables users to construct custom models via intuitive if-else syntax and to define latent learning rules for agents. For parameter estimation, it provides both likelihood-based inference (MLE and MAP) and simulation-based inference (ABC and RNN), with full support for parallel processing across subjects. The workflow is highly standardized, featuring four main functions that strictly follow the four-step protocol (and ten rules) proposed by Wilson & Collins (2019). Beyond the three built-in models (TD, RSTD, and Utility), users can easily derive new variants by declaring which variables are treated as free parameters.
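The four main functions named in the index below (run_m, rcv_d, fit_p, rpl_e) map onto the four steps of the Wilson & Collins (2019) protocol. The sketch below illustrates that workflow; the function names come from this page, but the argument names and call pattern are illustrative assumptions, not the package's documented API -- consult ?run_m and the package manual for the real signatures.

```r
# Hedged sketch of the four-step workflow in multiRL.
# Function names match the package index; arguments are assumptions
# for illustration only.
library(multiRL)

data(MAB)  # simulated multi-armed bandit dataset shipped with the package

# Step 1: build a reinforcement learning model (e.g. the built-in TD model)
model <- run_m(data = MAB, model = TD)

# Step 2: generate fake data for parameter and model recovery
recovery <- rcv_d(model)

# Step 3: optimize free parameters to fit the real data (e.g. via MLE)
fit <- fit_p(model, method = "MLE")

# Step 4: replay the experiment with the optimal parameters
replay <- rpl_e(fit)
summary(replay)
```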

Install

install.packages('multiRL')

Version

0.2.3

License

GPL-3

Maintainer

YuKi

Last Published

January 26th, 2026

Functions in multiRL (0.2.3)

estimate_0_ENV

Tool for Generating an Environment for Models
estimate

Estimation Methods
estimate_2_SBI

Simulation-Based Inference (SBI)
estimate_2_RNN

Estimation Method: Recurrent Neural Network (RNN)
estimate_1_LBI

Likelihood-Based Inference (LBI)
estimate_1_MLE

Estimation Method: Maximum Likelihood Estimation (MLE)
estimate_1_MAP

Estimation Method: Maximum A Posteriori (MAP)
estimate_2_ABC

Estimation Method: Approximate Bayesian Computation (ABC)
engine_ABC

The Engine of Approximate Bayesian Computation (ABC)
engine_RNN

The Engine of Recurrent Neural Network (RNN)
func_zeta

Function: Decay Rate
funcs

Core Functions
func_alpha

Function: Learning Rate
fit_p

Step 3: Optimizing parameters to fit real data
func_delta

Function: Upper-Confidence-Bound
estimation_methods

Estimation Methods
func_epsilon

Function: ε-first, Greedy, Decreasing
func_beta

Function: Soft-Max
multiRL-package

multiRL: Reinforcement Learning Tools for Multi-Armed Bandit
func_gamma

Function: Utility Function
policy

Policy of Agent
process_1_input

multiRL.input
process_3_record

multiRL.record
plot.multiRL.replay

plot.multiRL.replay
process_2_behrule

multiRL.behrule
priors

Density and Random Function
process_4_output_cpp

multiRL.output
params

Model Parameters
process_5_metric

multiRL.metric
process_4_output_r

multiRL.output
run_m

Step 1: Building the reinforcement learning model
rpl_e

Step 4: Replaying the experiment with optimal parameters
system

Cognitive Processing System
settings

Settings of Model
summary,multiRL.model-method

summary
rcv_d

Step 2: Generating fake data for parameter and model recovery
colnames

Column Names
algorithm

Algorithm Packages
MAB

Simulated Multi-Armed Bandit Dataset
Utility

Utility Model
TD

Temporal Difference Model
TAB

Group 2 from Mason et al. (2024)
behrule

Behavior Rules
RSTD

Risk-Sensitive Model
data

Dataset Structure
control

Control Algorithm Behavior