$$Q_{new} = Q_{old} + \alpha \cdot (R - Q_{old})$$
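For intuition, a single application of this delta-rule update can be computed directly; the values below are illustrative:

```r
# One step of the delta-rule update: Q_new = Q_old + alpha * (R - Q_old)
q_old <- 0.5   # current expected value
reward <- 1    # feedback received on this trial
alpha <- 0.1   # learning rate
q_new <- q_old + alpha * (reward - q_old)
q_new  # 0.55: the estimate moves a fraction alpha toward the reward
```

With alpha = 1 the estimate jumps straight to the reward; with alpha = 0 it never changes.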
func_alpha(qvalue, reward, params, system, ...)

Value: A NumericVector containing the updated values, computed based on the learning rate.
qvalue: The expected Q values of the different behaviors produced by the different systems, as updated up to this trial.
reward: The feedback received by the agent from the environment at trial(t) following the execution of action(a).
params: Parameters used by the model's internal functions; see params.
system: When the agent makes a decision, is a single system at work, or are multiple systems involved? See system.
...: Extra information passed to the function. It currently contains the following; additional information may be added in future package versions.
idinfo: includes the following:
    subid
    block
    trial
exinfo: contains information whose column names are specified by the user, e.g.:
    Frame
    RT
    NetWorth
    ...
behave: includes the following:
    action: the behavior performed by the human in the given trial.
    latent: the object updated by the agent in the given trial.
    simulation: the actual behavior performed by the agent.
func_alpha <- function(
  qvalue,
  reward,
  params,
  ...
) {
  # Bind the extra arguments passed via `...` (e.g. system, idinfo, exinfo,
  # behave) into the function's environment.
  list2env(list(...), envir = environment())
  # If you need extra information from `...`:
  # column names may be lost (C++ backend), so numeric indexes are recommended,
  # e.g.
  # Trial <- idinfo[3]
  # Frame <- exinfo[1]
  # Action <- behave[1]
  alpha <- params[["alpha"]]
  alphaN <- params[["alphaN"]]
  alphaP <- params[["alphaP"]]
  # Determine the model currently in use based on which parameters are free.
  if (
    system == "RL" && !is.null(alpha) && is.null(alphaN) && is.null(alphaP)
  ) {
    model <- "TD"
  } else if (
    system == "RL" && is.null(alpha) && !is.null(alphaN) && !is.null(alphaP)
  ) {
    model <- "RSTD"
  } else if (
    system == "WM"
  ) {
    model <- "WM"
    alpha <- 1
  } else {
    stop("Unknown model! Please modify your learning rate function.")
  }
  # TD: one learning rate for all prediction errors
  if (model == "TD") {
    update <- qvalue + alpha * (reward - qvalue)
  # RSTD: separate learning rates for negative and positive prediction errors
  } else if (model == "RSTD" && reward < qvalue) {
    update <- qvalue + alphaN * (reward - qvalue)
  } else if (model == "RSTD" && reward >= qvalue) {
    update <- qvalue + alphaP * (reward - qvalue)
  # WM: learning rate fixed at 1, so the estimate jumps to the reward
  } else if (model == "WM") {
    update <- qvalue + alpha * (reward - qvalue)
  }
  return(update)
}
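The RSTD branch applies a different learning rate depending on the sign of the prediction error: alphaN when the reward falls below the current estimate, alphaP otherwise. A standalone sketch of just that rule (the helper name `rstd_update` and all values are illustrative, not part of the package):

```r
# Illustrative RSTD-style update: asymmetric learning rates for
# negative vs. positive prediction errors (not the package function itself).
rstd_update <- function(qvalue, reward, alphaN, alphaP) {
  alpha <- if (reward < qvalue) alphaN else alphaP
  qvalue + alpha * (reward - qvalue)
}

rstd_update(qvalue = 0.5, reward = 0, alphaN = 0.2, alphaP = 0.05)  # 0.4
rstd_update(qvalue = 0.5, reward = 1, alphaN = 0.2, alphaP = 0.05)  # 0.525
```

With alphaN > alphaP, as here, losses pull the estimate down faster than equally sized gains pull it up, which is one way to capture loss-averse learning.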