fitRTConf: Function for fitting sequential sampling confidence models

Description

Fits the parameters of different models of response time and confidence, including the 2DSD model (Pleskac & Busemeyer, 2010), dynWEV, DDMConf, and various flavors of race models (Hellmann et al., 2023). The model to fit is specified by the argument model. Only maximum likelihood (ML) estimation is implemented. See dWEV, d2DSD, and dRM for more information about the parameters, and Details for parameters that are not fitted.

Usage

fitRTConf(data, model = "dynWEV", fixed = list(sym_thetas = FALSE),
  init_grid = NULL, grid_search = TRUE, data_names = list(),
  nRatings = NULL, restr_tau = Inf, precision = 1e-05, logging = FALSE,
  opts = list(), optim_method = "bobyqa", useparallel = FALSE,
  n.cores = NULL, ...)

Value

Returns a one-row data frame with one column per fitted parameter, additional information about the fit (negLogLik (for the final parameters), k (number of parameters), N (number of data rows), BIC, AICc, and AIC), and the column fixed, which holds all information about fixed and not-fitted parameters.
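
As a worked illustration of how the information criteria in the output relate to negLogLik, k, and N, the following sketch uses the textbook definitions. This is an assumption: the package's exact formulas, in particular the small-sample correction in AICc, may differ, and the numbers are invented for illustration.

```r
# Hypothetical fit summary (values invented for illustration)
negLogLik <- 150   # negative log-likelihood at the final parameters
k <- 10            # number of fitted parameters
N <- 120           # number of data rows (trials)

# Textbook definitions (assumed, not taken from the package source)
AIC  <- 2 * negLogLik + 2 * k
BIC  <- 2 * negLogLik + k * log(N)
AICc <- AIC + 2 * k * (k + 1) / (N - k - 1)

print(c(AIC = AIC, BIC = BIC, AICc = AICc))
```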

Arguments

data

a data.frame where each row is one trial, containing the following variables (column names can be changed by passing additional arguments of the form condition="contrast"):

condition (not necessary; for different levels of stimulus quality, will be transformed to a factor),
rating (discrete confidence judgments, should be given as integer vector; otherwise will be transformed to integer),
rt (giving the reaction times for the decision task),
any two of the following (see Details for more information about the accepted formats):
- stimulus (encoding the stimulus category in a binary choice task),
- response (encoding the decision response),
- correct (encoding whether the decision was correct; values in 0, 1)
sbj or participant (optional; giving the subject ID; only relevant if logging = TRUE; if unique, the ID is used in saved files with interim results and in logging messages; if non-unique or missing and logging = TRUE, 999 is used instead)
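
A minimal sketch of a data frame in the layout described above. All values are synthetic and purely illustrative. The column names follow the defaults listed for the data argument, with stimulus and response chosen as the two decision-related columns.

```r
set.seed(42)
n <- 8  # number of synthetic trials

trials <- data.frame(
  participant = 1,                                        # subject ID (optional)
  condition   = rep(c("low", "high"), each = n / 2),      # stimulus quality
  stimulus    = sample(c(-1, 1), n, replace = TRUE),      # numeric format -1 / 1
  response    = sample(c("lower", "upper"), n, replace = TRUE),
  rt          = round(runif(n, min = 0.3, max = 1.5), 3), # reaction times
  rating      = sample(1:4, n, replace = TRUE)            # discrete confidence
)

# Ratings should be integers
trials$rating <- as.integer(trials$rating)
str(trials)
```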

model

character scalar. One of "dynWEV", "2DSD", "IRM", "PCRM", "IRMt", "PCRMt", or "DDMConf" for the model to be fit.

fixed

list. List with parameter-value pairs for parameters that should not be fitted. See Details.

init_grid

data.frame or NULL. Grid for the initial parameter search. Each row is one parameter constellation. See Details for more information. If NULL, a default grid is used.

grid_search

logical. If FALSE, the grid search before the optimization algorithm is omitted. The fitting is then started with a mean parameter set from the default grid (if init_grid=NULL) or directly with the rows from init_grid, if not NULL. (Default: TRUE)

data_names

named list (e.g. list(rating="confidence")). Alternative way of specifying other column names for the variables in the data. By default, column names are identical to the ones given in the description of the data argument.

nRatings

integer. Number of rating categories. If NULL, the maximum of rating and length(unique(rating)) is used. This argument is especially important for data sets in which not the whole range of rating categories is realized. If given, rating has to be a factor or an integer vector.
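
To illustrate why nRatings matters, consider a 5-point scale on which some categories were never used. The sketch below reads "the maximum of rating and length(unique(rating))" as the larger of the two numbers, which is an assumption about the default rule. The rating vector is invented.

```r
# Ratings on a 5-point scale; categories 3 and 5 never occur in this sample
rating <- c(1L, 2L, 2L, 4L, 1L, 4L)

# Assumed reading of the default rule when nRatings is NULL
default_guess <- max(max(rating), length(unique(rating)))
print(default_guess)

# The true number of categories has to be supplied explicitly
nRatings <- 5L
```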

restr_tau

numerical or Inf or "simult_conf". For 2DSD and dynWEV only. Upper bound for tau. Fits will be in the interval (0,restr_tau). If Inf (the default) or FALSE, tau is unbounded. For "simult_conf", see the documentation of d2DSD and dWEV.

precision

numerical scalar. For 2DSD and dynWEV only. Precision of the calculation of the density functions in the respective models (see dWEV for more information).

logging

logical. If TRUE, a folder 'autosave/fitmodel' is created and messages about the process are printed in a logging file and to console (depending on OS). Additionally intermediate results are saved in a .RData file with the participant ID in the name.

opts

list. A list of further control options for the optimization routines (depending on optim_method). See Details for more information.

optim_method

character. Determines which optimization function is used for the parameter estimation. Either "bobyqa" (default), "L-BFGS-B" or "Nelder-Mead". The first two use box-constrained optimization; "bobyqa" additionally relies on quadratic interpolation (see bobyqa for more information). For "Nelder-Mead", a transfinite function rescaling is used (i.e. the constrained arguments are suitably transformed to the whole real line).
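
The idea behind the transfinite rescaling for Nelder-Mead can be sketched as follows. The logistic transform, the helper names to_real and to_box, the bounds, and the toy objective are all assumptions made for illustration. The transform actually used by the package is not documented here.

```r
# Hypothetical helpers (not part of the package):
# map a box-constrained parameter to the real line and back
to_real <- function(p, lower, upper) qlogis((p - lower) / (upper - lower))
to_box  <- function(q, lower, upper) lower + (upper - lower) * plogis(q)

lower <- c(a = 0, tau = 0)
upper <- c(a = 5, tau = 2)

# Toy objective with its minimum at a = 2, tau = 1.5 (inside the box)
objective <- function(p) (p[1] - 2)^2 + (p[2] - 1.5)^2

# Nelder-Mead runs unconstrained on the transformed scale
start <- c(a = 1, tau = 0.5)
res <- optim(to_real(start, lower, upper),
             function(q) objective(to_box(q, lower, upper)),
             method = "Nelder-Mead")

# Back-transform: the estimate always lies strictly inside the bounds
est <- to_box(res$par, lower, upper)
print(est)
```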

useparallel

logical. If TRUE the grid search in the beginning is done with a parallel back-end, using the parallel package.

n.cores

integer or NULL. Number of cores used for parallelization. If NULL (default), the number of available cores minus 1 is used.

...

Possibility of giving alternative variable names in data frame (in the form condition = "SOA", or response="pressedKey").

Author

Sebastian Hellmann.

Details

The fitting starts with a grid search: the likelihood is computed on an initial grid of candidate parameter sets to find starting values for the optimization routine. The best nAttempts parameter sets are then each used as the start of an optimization, which is done with the algorithm selected by the argument optim_method. The Nelder-Mead algorithm uses the R function optim. The optimization routine is restarted nRestarts times, with the starting parameter set equal to the best parameters from the previous routine.
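
The procedure described above (grid search, then optimization from the best starting points, with restarts) can be sketched on a toy problem. This is not the package's implementation: the objective is a simple normal likelihood, and running the optimizer nRestarts times in sequence is one reading of the restart rule.

```r
set.seed(1)
x <- rnorm(200, mean = 1.5, sd = 0.8)   # toy data

# Toy negative log-likelihood with parameters p = (mu, sigma)
negloglik <- function(p) {
  if (p[2] <= 0) return(1e10)           # penalize invalid sigma
  -sum(dnorm(x, mean = p[1], sd = p[2], log = TRUE))
}

# 1. Grid search: evaluate the objective on every row of the grid
grid <- expand.grid(mu = seq(-2, 4, by = 1), sigma = c(0.5, 1, 2))
grid$nll <- apply(grid[, c("mu", "sigma")], 1, negloglik)

# 2. Keep the nAttempts best rows as starting points
nAttempts <- 3
nRestarts <- 2
starts <- grid[order(grid$nll), ][seq_len(nAttempts), ]

# 3. Optimize from each start, restarting from the previous optimum
fits <- lapply(seq_len(nAttempts), function(i) {
  p <- unlist(starts[i, c("mu", "sigma")])
  for (r in seq_len(nRestarts)) {
    res <- optim(p, negloglik, method = "Nelder-Mead",
                 control = list(maxit = 2000, reltol = 1e-6))
    p <- res$par
  }
  res
})

# 4. Report the best fit across all attempts
best <- fits[[which.min(vapply(fits, function(f) f$value, numeric(1)))]]
print(best$par)
```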

stimulus, response and correct. Two of these columns must be given in data. If all three are given, correct will have no effect (and will not be checked!). stimulus can always be given in numerical format with values -1 and 1. response can always be given as a character vector with "lower" and "upper" as values. correct must always be given as a 0-1-vector. If the stimulus column is given together with a response column and neither matches the above format, they need to have the same values/levels (if factor). If only one of stimulus or response is given in any other format together with correct, the unique values will be sorted increasingly and the first value will be encoded as "lower"/-1 and the second as "upper"/+1.
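
The recoding rule for non-standard formats can be sketched in base R. The helper encode_binary and the example values are hypothetical. The sketch only mirrors the rule stated above (sorted unique values, first one is -1/"lower", second is +1/"upper") and is not the package's internal code.

```r
# Hypothetical helper: encode a two-valued vector as -1 / +1
encode_binary <- function(x) {
  lev <- sort(unique(x))
  stopifnot(length(lev) == 2)
  ifelse(x == lev[1], -1, 1)
}

# Only stimulus (in a non-standard format) and correct are available
stimulus <- c("right", "left", "left", "right")
correct  <- c(1, 0, 1, 1)

# "left" sorts before "right", so left is -1 and right is +1
stim_num <- encode_binary(stimulus)

# A correct response equals the stimulus, an error is the opposite category
resp_num <- ifelse(correct == 1, stim_num, -stim_num)
response <- ifelse(resp_num == 1, "upper", "lower")

print(data.frame(stimulus, stim_num, correct, response))
```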

fixed. Parameters that should not be fitted but kept constant. These will be dropped from the initial grid search but will be present in the output, so that the result contains all parameters needed for prediction. This includes the possibility of symmetric confidence thresholds for both alternatives (sym_thetas=logical). Other examples are z=.5, sv=0, st0=0, sz=0. For race models, setting a='b' (or vice versa) leads to identical upper bounds on the decision processes, which is the equivalent of z=.5 in a diffusion process.
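
As a small illustration of the fixed argument, the lists below collect the examples named above. The variable names are hypothetical, and the commented calls assume a data frame named data in the documented format.

```r
# Diffusion-based models (2DSD, dynWEV): unbiased start, no extra variability
fixed_diffusion <- list(sym_thetas = TRUE, z = 0.5, sv = 0, st0 = 0, sz = 0)

# Race models: tie threshold a to threshold b (the analogue of z = 0.5)
fixed_race <- list(sym_thetas = FALSE, a = "b")

str(fixed_diffusion)
str(fixed_race)

# Possible calls (not run here):
# fitRTConf(data, model = "2DSD", fixed = fixed_diffusion)
# fitRTConf(data, model = "IRM",  fixed = fixed_race)
```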

Parameters not fitted. The models are developed continuously and not all changes are adopted in the fitting function immediately. The following parameters are currently not included in the fitting routine:

in race models: sza, szb, smu1, and smu2

init_grid. Each row should be one parameter set to check. The column names should include the parameters of the desired model. For 2DSD these are: a, vmin and vmax (will be equidistantly spanned across conditions), sv, z (as the relative starting point between 0 and a), sz (also in relative terms), t0, st0, theta0 (minimal threshold), thetamax (maximal threshold; the others will be equidistantly spanned symmetrically for both decisions), and tau. For dynWEV, w, svis, and sigvis are required in addition. For the race models the parameters are: vmin, vmax (will be equidistantly spanned across conditions), a and b (decision thresholds), t0, st0, theta0 (minimal threshold), thetamax (maximal threshold; the others will be equidistantly spanned symmetrically for both decisions), and for time-dependent confidence race models additionally wrt and wint (as weights compared to wx=1).
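
A custom init_grid can be built with expand.grid. The sketch below uses the 2DSD column names listed above and extends the grid with the additional dynWEV columns. All numeric values are arbitrary placeholders, not recommended starting values.

```r
# Grid for 2DSD: every combination of the supplied values becomes one row
grid_2dsd <- expand.grid(
  a = c(1, 2), vmin = 0.1, vmax = c(1.5, 3), sv = 0.5,
  z = 0.5, sz = 0.1, t0 = 0.1, st0 = 0.2,
  theta0 = 0.5, thetamax = 2, tau = c(0.5, 1.5)
)

# dynWEV additionally needs w, svis, and sigvis;
# merge() without shared columns returns the Cartesian product
grid_wev <- merge(grid_2dsd,
                  expand.grid(w = c(0.3, 0.7), svis = 0.5, sigvis = 0.5))

dim(grid_2dsd)
dim(grid_wev)

# Possible call (not run here):
# fitRTConf(data, model = "dynWEV", init_grid = grid_wev)
```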

opts. A list with numerical values. Possible options are listed below (together with the optimization method they are used for).

nAttempts (all): number of best performing initial parameter sets used for optimization; default: 5, if grid_search is TRUE. If grid_search is FALSE and init_grid is NULL, nAttempts will be set to 1 (and any input will be ignored). If grid_search is FALSE and init_grid is not NULL, the rows of init_grid will be used from top to bottom (since no initial grid search is done), with no more than nAttempts rows used.
nRestarts (all): number of successive optim routines for each of the starting parameter sets; default: 5.
maxfun ('bobyqa'): maximum number of function evaluations; default: 5000.
maxit ('Nelder-Mead' and 'L-BFGS-B'): maximum number of iterations; default: 2000.
reltol ('Nelder-Mead'): relative tolerance; default: 1e-6.
factr ('L-BFGS-B'): tolerance in terms of reduction factor of the objective; default: 1e-10.
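
As an illustration, the following lists spell out the documented defaults for each optimization method. The variable names are hypothetical, and the commented call assumes a data frame named data in the documented format.

```r
# Options shared by all methods plus the method-specific ones
opts_bobyqa <- list(nAttempts = 5, nRestarts = 5, maxfun = 5000)
opts_nm     <- list(nAttempts = 5, nRestarts = 5, maxit = 2000, reltol = 1e-6)
opts_lbfgsb <- list(nAttempts = 5, nRestarts = 5, maxit = 2000, factr = 1e-10)

str(opts_nm)

# Possible call (not run here):
# fitRTConf(data, model = "2DSD", optim_method = "Nelder-Mead", opts = opts_nm)
```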

References

Hellmann, S., Zehetleitner, M., & Rausch, M. (2023). Simultaneous modeling of choice, confidence and response time in visual perception. Psychological Review. Epub ahead of print, 2023 Mar 13. doi: 10.1037/rev0000411. PMID: 36913292.

https://nashjc.wordpress.com/2016/11/10/why-optim-is-out-of-date/

https://www.damtp.cam.ac.uk/user/na/NA_papers/NA2009_06.pdf

Examples


# We use one of the implemented models, "dynWEV"
# 1. Generate data
# data with positive drift (stimulus = "upper")
data <- rWEV(20, a=2,v=0.5,t0=0.2,z=0.5, sz=0.1,sv=0.1, st0=0,  tau=4, s=1, w=0.3)
data$stimulus <- "upper"
# data with negative drift (stimulus = "lower") but same intensity
data2 <- rWEV(100, a=2,v=-0.5,t0=0.2,z=0.5,sz=0.1,sv=0.1, st0=0,  tau=4, s=1, w=0.3)
data2$stimulus <- "lower"
data <- rbind(data, data2)
# Transfer response column and add dummy condition column
data$response <- ifelse(data$response==1, "upper", "lower")
data$condition <- 1
# Take some confidence thresholds for discrete ratings
threshs <- c(-Inf, 1, 2, Inf)
data$rating <- as.numeric(cut(data$conf, breaks = threshs, include.lowest = TRUE))
head(data)

# 2. Use fitting function
# Fitting the model with these opts results in a pretty bad fit
# (especially because of omitting the grid_search)
# \donttest{
   fitRTConf(data, "dynWEV", fixed=list(sym_thetas=TRUE, z=0.5, st0=0),
            grid_search = FALSE, logging=FALSE,
            opts = list(nAttempts=1, nRestarts=2, maxfun=2000))
 # }
