multiRL (version 0.2.3)

system: Cognitive Processing System

Description

In a Markov Decision Process, an agent need not update a single Q-value table alone. That is, the process may be governed not by one cognitive processing system but by a weighted combination of several. Each cognitive processing system updates its own Q-value table and, from that table, derives the probability of executing each action on a given trial. The agent then takes a weighted combination of the action-selection probabilities supplied by each system to obtain the final probability of executing each action.
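
A minimal sketch of this mixing step, assuming a softmax choice rule (the softmax and all variable names below are illustrative, not part of the multiRL API):

    softmax <- function(q, beta = 5) exp(beta * q) / sum(exp(beta * q))

    q_sys1 <- c(0.2, 0.6)   # hypothetical Q-value table row, system 1
    q_sys2 <- c(1.0, 0.0)   # hypothetical Q-value table row, system 2
    w      <- 0.7           # hypothetical weight on system 2

    # Weighted combination of each system's action-selection probabilities
    p_final <- w * softmax(q_sys2) + (1 - w) * softmax(q_sys1)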

Arguments

Class

system [Character]

Details

  • Reinforcement Learning: An incremental cognitive processing system that integrates reward history over long timescales to build stable action-value representations through prediction errors. It is robust but slow to adapt to sudden changes.

  • Working Memory: A rapid-acquisition cognitive processing system that updates stimulus-response associations almost instantaneously. Its contribution, however, is strictly constrained by limited storage capacity and is highly susceptible to decay over time and to interference from intervening trials. Both update rules are sketched after this list.
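
A rough sketch of the two update rules, assuming a delta-rule RL update and a decay-to-baseline WM store (alpha, phi, and q0 are illustrative parameter names, not multiRL internals):

    # RL: incremental prediction-error update with learning rate alpha < 1
    update_rl <- function(q, a, r, alpha = 0.1) {
      q[a] <- q[a] + alpha * (r - q[a])
      q
    }

    # WM: store the latest outcome exactly (learning rate of 1), while all
    # entries decay toward their initial value q0 at rate phi each trial
    update_wm <- function(q, a, r, phi = 0.2, q0 = 0.5) {
      q    <- q + phi * (q0 - q)
      q[a] <- r
      q
    }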

Example

  • system = "RL": A single-system model based on incremental Reinforcement Learning (RL). The agent updates option values using a learning rate (alpha) typically less than 1, representing a slow, integrative process linked to corticostriatal circuitry.

  • system = "WM": A single-system model representing Working Memory (WM). Unlike RL, this system has the capacity to instantly update values with a fixed learning rate of 1, effectively "remembering" the most recent outcome for each stimulus.

  • system = c("RL", "WM"): A hybrid model where Reinforcement Learning (RL) and Working Memory (WM) systems operate in parallel, maintaining two distinct Q-value tables. The final decision is a weighted integration of both systems' choice probabilities. The contribution of Working Memory (WM) is constrained by its capacity; if the stimulus set size exceeds capacity, the agent's reliance shifts toward the Reinforcement Learning (RL) system as the Working Memory (WM) reliability diminishes. See capacity in params for details.

    If one assumes that multiple cognitive processing systems are involved in the Markov Decision Process, their relative influence can be controlled by assigning a weight to each system. See weight in params for details. A minimal sketch of this capacity-dependent weighting follows.
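
The following sketch illustrates the weighted mixture with a capacity-limited WM weight, in the spirit of Collins and Frank (2012); rho, capacity, and set_size are illustrative names, not the actual params entries:

    softmax <- function(q, beta = 5) exp(beta * q) / sum(exp(beta * q))
    p_rl <- softmax(c(0.2, 0.6))   # RL system's action probabilities
    p_wm <- softmax(c(1.0, 0.0))   # WM system's action probabilities

    # WM weight shrinks once the stimulus set size exceeds WM capacity
    rho <- 0.9; capacity <- 3; set_size <- 6
    w <- rho * min(1, capacity / set_size)   # here w = 0.45

    p_final <- w * p_wm + (1 - w) * p_rl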

References

Collins, A. G., & Frank, M. J. (2012). How much of reinforcement learning is working memory, not reinforcement learning? A behavioral, computational, and neurogenetic analysis. European Journal of Neuroscience, 35(7), 1024-1035. tools:::Rd_expr_doi("10.1111/j.1460-9568.2011.07980.x")