Learn R Programming

⚠️There's a newer version (0.2.6.2) of this package.Take me there.

Dynamic Models of Choice with Better Graphic Tools and Quicker Computations

ggdmc, evolving from dynamic model of choice (DMC, Heathcote et al., 2018), is a generic tool for hierarchical Bayesian Computations.

  1. Instead of using Gibbs or HMC, ggdmc uses population-based MCMC (pMCMC)

samplers. A notable Gibbs example is the Python-based HDDM (Wiecki, Sofer & Frank, 2013), which does not allow the user to conveniently set the variabilities of DDM parameters.

  1. ggdmc uses a different variant of migration operator, which safeguards

the detailed balance. It is not imperative to turn off the migration operator. For some complex models, it is preferalbe to use a mixture of differernt genetic operators to prevent premature convergence. Further, pMCMC is more efficient when a combination of operators is used.

  1. ggdmc uses two parallel methods. First is via the parallel package in R.

This facilitates the computations for fitting many participants, namely fixed-effects models. The second is via OpenMP at C++ level. This facilitates the computations for fitting hierarchical models. For an advanced parallel computation technique / algorithm, please see my R package, ppda, which implements CUDA C language for massive parallel computations.

Getting Started

Below is an example using the LBA Model (Brown & Heathcote, 2008). Please see my tutorials site for more details. The names of R functions in ggdmc mostly informs what they are doing, such as BuildModel. The syntax differs from DMC; Nevertheless, although ggdmc seemingly similar with DMC, its core of Bayesian sampling is a new implementation, because the different method of parallelism and pMCMC algorithms. Please use with your own risk.

Note the sequence of parameters in a parameter vector (i.e., p.vector) must strictly follow the sequence in the p.vector reported by BuildModel. Otherwise, run function will throw an error message to stop you from fitting a model.

require(ggdmc) 
model <- BuildModel(p.map = list(A = "1", B = "R", t0 = "1",
                            mean_v = c("F", "M"), sd_v = "M", st0 = "1"),
          match.map = list(M = list(s1 = 1, s2 = 2)),
          factors   = list(S = c("s1", "s2"), F = c("f1", "f2")),
          constants = c(sd_v.false = 1, st0 = 0), 
          responses = c("r1", "r2"),
          type      = "norm")

## Population distribution, rate effect on F
pop.mean <- c(A = .4, B.r1 = 1.2, B.r2 = 2.8, t0 = .2,
              mean_v.f1.true = 2.5, mean_v.f2.true = 1.5, mean_v.f1.false = .35,
              mean_v.f2.false = .25, sd_v.true = .25)
pop.scale <- c(A = .1, B.r1 = .1, B.r2 = .1, t0 = .05,
               mean_v.f1.true = .2, mean_v.f2.true = .2, mean_v.f1.false = .2,
               mean_v.f2.false = .2, sd_v.true = .1)
pop.prior <- BuildPrior(
  dists = rep("tnorm", 9),
  p1    = pop.mean,
  p2    = pop.scale,
  lower = c(0,   0,  0, .1, NA, NA, NA,  0,  0),
  upper = c(NA, NA, NA,  1, NA, NA, NA, NA, NA))

## Draw parameter prior distributions to visually check
plot(pop.prior)
print(pop.prior)

## Simulate 20 participants, each condition has 30 responses.
## The true parameter vectors are drawn from parameter prior distribution
## specified in 'pop.prior' 
dat <- simulate(model, nsim = 30, nsub = 20, prior = pop.prior)
dmi <- BuildDMI(dat, model)    ## dmi = data model instance 

## Extract the mean and variabilities of parameters across the 40 participants
## ps <- attr(dat, "parameters")
##
## Please use matrixStats package, which is even faster than the C functions 
## in base package
##
## round(matrixStats::colMeans2(ps), 2)
## round(matrixStats::colSds(ps), 2)
##    A  B.r1  B.r2  mean_v.f1.true  mean_v.f2.true mean_v.f1.false 
## 0.43  1.22  1.00            0.21            2.48            1.45 
## 0.10  0.10  0.01            0.04            0.17            0.19 
## mean_v.f2.false  sd_v.true     t0
##            0.34       0.36   0.23
##            0.16       0.18   0.10
           
## FIT hierarchical model
p.prior <- BuildPrior(
  dists = rep("tnorm", 9),
  p1    = pop.mean,
  p2    = pop.scale,
  lower = c(0,   0, 0, .1, NA, NA, NA,  0,  0),
  upper = c(NA, NA, 1, NA, NA, NA, NA, NA, NA))

## Specify prior distributions at the hyper level
mu.prior <- BuildPrior(
  dists = rep("tnorm", 9),
  p1    = pop.mean,                           
  p2    = c(1,   1,  1,  2,   2,  2,  2,  1, 1),
  lower = c(0,   0,  0, .1,  NA, NA, NA, NA, 0),
  upper = c(NA, NA, NA, NA,  NA, NA, NA, NA, NA))

## lower and upper are taken care of by defaults.
sigma.prior <- BuildPrior(
  dists = rep("beta", 9),
  p1    = c(A=1, B.r1=1, B.r2=1, t0 = 1, mean_v.f1.true=1, mean_v.f2.true=1,
            mean_v.f1.false=1, mean_v.f2.false=1, sd_v.true = 1),
  p2    = rep(1, 9))

pp.prior <- list(mu.prior, sigma.prior)

## Visually check mu priors and sigma priors
plot(pp.prior[[1]])
plot(pp.prior[[2]])

## Initialise a small sample 
## Get the number of parameters
npar <- length(GetPNames(model)); npar
thin <- 2

## Initiate 100 new hierarchical samples and specify (randomly) only the first 
## iteration.  Others are NA or -Inf. 
hsam <- StartNewHypersamples(1e2, dmi, p.prior, pp.prior)
hsam <- run(hsam, report = 20, pm = .3, hpm = .3)

How to pipe DMC samples to ggdmc samplers

Diffusion decision model (Ratcliff & McKoon, 2008) is one of the most popular cognitive models to fit RT data in cognitive psychology. Here we show two examples, one fitting the LBA model and the other fitting DDM.

###################
##   LBA model   ##
###################
## DMC could be downloaded at "osf.io/pbwx8".
setwd("~/Documents/DMCpaper/")
source ("dmc/dmc.R")
load_model ("LBA","lba_B.R")
setwd("~/Documents/ggdmc_paper/")
## load("data/dmc_pipe_LBA.rda")

model <- model.dmc(p.map = list( A = "1", B = "R", t0 = "1",
                                mean_v = c("F", "M"), sd_v = "M", st0 = "1"),
          match.map = list(M = list(s1 = 1, s2 = 2)),
          factors = list(S = c("s1", "s2"), F = c("f1", "f2")),
          constants = c(sd_v.false = 1, st0 = 0),
          responses = c("r1", "r2"),
          type = "norm")
pop.mean <- c(A=.4, B.r1=.6, B.r2=.8, t0=.3, mean_v.f1.true=1.5, 
              mean_v.f2.true=1, mean_v.f1.false=0, mean_v.f2.false=0,
              sd_v.true = .25)
pop.scale <-c(A=.1, B.r1=.1, B.r2=.1, t0=.05, mean_v.f1.true=.2, 
              mean_v.f2.true=.2, mean_v.f1.false=.2, mean_v.f2.false=.2,
              sd_v.true = .1)
pop.prior <- prior.p.dmc(
  dists = rep("tnorm",9),
  p1    = pop.mean,
  p2    = pop.scale,
  lower = c(0,0,0,NA,NA,NA,NA,0,.1),
  upper = c(NA,NA,NA,NA,NA,NA,NA,NA,1))
raw.data <- h.simulate.dmc(model, p.prior = pop.prior, n = 30, ns = 20)
data.model <- data.model.dmc(raw.data, model)

ps <- attr(raw.data, "parameters")

p.prior <- prior.p.dmc(
  dists = rep("tnorm",9),
  p1    = pop.mean,
  p2    = pop.scale*5,
  lower = c(0,0,0,NA,NA,NA,NA,0,.1),
  upper = c(NA,NA,NA,NA,NA,NA,NA,NA,NA))
mu.prior <- prior.p.dmc(
  dists = rep("tnorm",9),
  p1    = pop.mean,
  p2    = c(1,1,1,2,2,2,2,1,1),
  lower = c(0,0,0,NA,NA,NA,NA,0,.1),
  upper = c(NA,NA,NA,NA,NA,NA,NA,NA,NA))
sigma.prior <- prior.p.dmc(
  dists = rep("beta", 9),
  p1    = c(A=1, B.r1=1, B.r2=1, t0=1, mean_v.f1.true=1, mean_v.f2.true=1, 
            mean_v.f1.false=1, mean_v.f2.false=1, sd_v.true = 1),
  p2    = rep(1,9))
pp.prior <- list(mu.prior, sigma.prior)

hsamples <- h.samples.dmc(nmc = 50, p.prior, data.model, pp.prior = pp.prior,
  thin = 1)

## Piping 
hsam0 <- ggdmc::run(hsamples, pm = .05, report = 10)


## Turn off migration. Default pm = 0
hsam1 <- ggdmc::run(h.samples.dmc(nmc = 50, p.prior, samples = hsam0, 
  pp.prior = pp.prior, thin = 1))
hsam2 <- ggdmc::run(h.samples.dmc(nmc = 50, p.prior, samples = hsam1,
   pp.prior = pp.prior, thin = 1))
hsam3 <- ggdmc::run(h.samples.dmc(nmc = 50, p.prior, samples = hsam2,
  pp.prior = pp.prior, thin = 1))

## Check whether MCMC converge
plot(hsam3, pll = TRUE)
plot(hsam3)


hest1 <- summary(hsam3, hyper = TRUE, recovery = TRUE, ps = pop.mean)
hest2 <- summary(hsam3, hyper = TRUE, recovery = TRUE, ps = pop.scale, type = 2)
est <- summary(hsam3)


###################
##   DDM         ##
###################
rm(list = ls())
setwd("~/Documents/DMCpaper")
source ("dmc/dmc.R")
load_model ("DDM", "ddm.R")
setwd("~/Documents/ggdmc_paper/")
## load("data/hierarchical/dmc_pipe_DDM.rda")
model <- model.dmc(
    p.map     = list(a = "1", v = "F", z = "1", d = "1", sz = "1", sv = "1",
                     t0 = "1", st0 = "1"),
  match.map = list(M = list(s1 = "r1", s2 = "r2")),
  factors   = list(S = c("s1", "s2"), F = c("f1", "f2")),
  constants = c(st0 = 0, d = 0),
  responses = c("r1", "r2"),
  type      = "rd")
  
## Population distribution
pop.mean <- c(a=2,  v.f1=4, v.f2=3, z=0.5, sz=0.3, sv=1, t0=0.3)
pop.scale <-c(a=0.5,v.f1=.5,v.f2=.5,z=0.1, sz=0.1, sv=.3,t0=0.05)
pop.prior <- prior.p.dmc(
   dists = rep("tnorm", 7),
   p1    = pop.mean,
   p2    = pop.scale,
   lower = c(0,-5, -5, 0, 0, 0, 0),
   upper = c(5, 7,  7, 1, 2, 1, 1) )

raw.data   <- h.simulate.dmc(model, p.prior = pop.prior, n = 32, ns = 8)
data.model <- data.model.dmc(raw.data, model)
ps <- attr(raw.data, "parameters")

p.prior <- prior.p.dmc(
  dists = c("tnorm","tnorm","tnorm","tnorm","tnorm","tnorm","tnorm"),
  p1=pop.mean,
  p2=pop.scale*5,
  lower=c(0,-5, -5, 0, 0, 0, 0),
  upper=c(5, 7,  7, 1, 2, 1, 1)
)

mu.prior <- prior.p.dmc(
  dists = c("tnorm","tnorm","tnorm","tnorm","tnorm","tnorm","tnorm"),
  p1=pop.mean,
  p2=pop.scale*5,
  lower=c(0,-5, -5, 0, 0, 0, 0),
  upper=c(5, 7,  7, 1, 2, 1, 1)
)
sigma.prior <- prior.p.dmc(
  dists = rep("beta", length(p.prior)),
  p1=c(a=1, v.f1=1,v.f2 = 1, z=1, sz=1, sv=1, t0=1),
  p2=c(1,1,1,1,1,1,1),
  upper=c(2,2,2,2,2,2,2)
)

pp.prior <- list(mu.prior, sigma.prior)
  
## Sampling  
hsam0 <- ggdmc::run(h.samples.dmc(nmc = 512, p.prior = p.prior, 
      pp.prior = pp.prior, data = data.model, thin = 2), pm = .1)

hsam1 <- ggdmc::run(h.samples.dmc(nmc = 512, p.prior = p.prior, 
      pp.prior = pp.prior, samples = hsam0, thin = 32), pm = .1)

hsam2 <- ggdmc::run(h.samples.dmc(nmc = 512, p.prior = p.prior, 
      pp.prior = pp.prior, samples = hsam1, thin = 128), pm = .3)

hsam3 <- ggdmc::run(h.samples.dmc(nmc = 512, p.prior = p.prior, 
      pp.prior = pp.prior, samples = hsam2, thin = 2))

How to fit fixed-effect model with multiple participants

require(ggdmc)
## Model Setup----------
model <- BuildModel(p.map = list( A = "1", B = "R", t0 = "1",
   mean_v = c("F", "M"), sd_v = "M", st0 = "1"),
   match.map = list(M = list(s1 = 1, s2 = 2)),
   factors = list(S = c("s1", "s2"), F = c("f1", "f2")),
   constants = c(sd_v.false = 1, st0 = 0),
   responses = c("r1", "r2"),
   type = "norm")
   
npar <- length(model)

## Population distribution, rate effect on F
pop.mean <- c(A = .4, B.r1 = .6, B.r2 = .8, t0 = .3, mean_v.f1.true = 1.5,
  mean_v.f2.true = 1, mean_v.f1.false = 0, mean_v.f2.false = 0, sd_v.true = .25)
pop.scale <-c(A = .1, B.r1 = .1, B.r2 = .1, t0 = .05, mean_v.f1.true = .2,
  mean_v.f2.true = .2, mean_v.f1.false = .2, mean_v.f2.false = .2, sd_v.true = .1)
pop.prior <- BuildPrior(
  dists = rep("tnorm", npar),
  p1    = pop.mean,
  p2    = pop.scale,
  lower = c(0,   0,  0, NA, NA, NA, NA, 0, .1),
  upper = c(NA, NA, NA, NA, NA, NA, NA, NA, 1))

## Simulate some data
nsubject <- 8
ntrial <- 1e2
dat <- simulate(model, nsim = ntrial, nsub = nsubject, prior = pop.prior)
dmi <- BuildDMI(dat, model)
ps  <- attr(dat, "parameters")

# round( matrixStats::colMeans2(ps), 2)
# round( matrixStats::colSds(ps), 2)

## FIT FIXED-EFFECT MODEL----------
## Use all truncated normal priors for locations
p.prior <- BuildPrior(
  dists = rep("tnorm", npar),
  p1    = pop.mean,
  p2    = pop.scale * 5,
  lower = c(0,   0,  0, NA, NA, NA, NA,  0, .1),
  upper = c(NA, NA, NA, NA, NA, NA, NA, NA, NA))

thin <- 8
nchain <- npar * 3
migrationRate <- .2
sam <- run(StartManynewsamples(512, dmi, p.prior, thin, nchain),
  pm = migrationRate, ncore = 8)
sam <- run(RestartManysamples(512, sam), pm = migrationRate, ncore = 8)
sam <- run(RestartManysamples(512, sam, thin = 16), pm = migrationRate, ncore = 8)
sam <- run(RestartManysamples(512, sam, thin = 64), pm = migrationRate, ncore = 8)
sam <- run(RestartManysamples(512, sam, thin = 16), pm = migrationRate, ncore = 8)

plot(sam[[6]])
plot(sam)  ## This will take a while, because ploting many participants

How to conduct automatic convergence checks

One challenge in Bayesian modeling is to make sure posterior samples are from target distribuitons. When using the DE-MCMC sampler to fit very complex models, some regions in parameter spaces might be difficult to handle. Here we provide one method to conduct automatic sampling by repeatedly fitting models until posterior samples are proper.

First, we convert the first stage samples (i.e., hsam0) to a generic object, hsam. Then, we use the repeat function to iterate model fits. Meanwhile, we use hgelman to check whether both the PSRFs at the hyper- and data-level are smaller than 1.1, suggesting that chains are well mixed.

thin <- 2
hsam <- hsam0
counter <- 1
repeat {
  hsam <- run(RestartHypersamples(5e2, hsam, thin = thin),
    pm = .3, hpm = .3)
  save(hsam, file = "data/tmp_posterior_samples.rda")
  rhats <- hgelman(hsam)
  counter <- counter + 1
  thin <- thin * 2
  if (all(rhats < 1.1) || counter > 1e2) break
}

List of models currently hard-wired in ggdmc

  1. The LBA model, type = "norm",
  2. The DDM, type = "rd",
  3. The Piecewise LBA model 0; CPU-based PDA likelihoods; type = "plba0",
  4. The Piecewise LBA model 1; CPU-based PDA likelihoods; type = "plba1",
  5. The Piecewise LBA model 0; GPU-based PDA likelihoods; type = "plba0_gpu",
  6. The Piecewise LBA model 1; GPU-based PDA likelihoods; type = "plba1_gpu",
  7. The LBA model; GPU-based PDA likelihoods;, type = "norm_pda_gpu",
  8. The correlated accumualtor model; type = "cnorm".

For the details regarding PLBA types, please see Holmes, Trueblood, and Heathcote (2016)

Further information

Please see my tutorials site, Cognitive Model, for more details.

Prerequisites

  • R (>= 3.4.0)
  • Rcpp (>= 0.12.10), RcppArmadillo (>= 0.7.700.3.0), ggplot2 (>= 2.1.0), rtdists (>= 0.6-6), ggmcmc (>= 0.7.3), coda (>= 0.16-1) tmvtnorm, matrixStats, data.table
  • Windows users need Rtools (>= 3.3.0.1959)
  • Mac OS users need to make clang understand OpenMP flag
  • Linux/Unix users may need to install Open MPI library, if it has not been installed.
  • Armadillo requires a recent compiler; for the g++ family at least version 4.6.*

Installation

CRAN method is currently unavailable. The latest version of the package is still in the process of CRAN standard software checks.

~~From CRAN: ~~ > install.packages("ggdmc")

From source:

install.packages("ggdmc_0.2.5.2.tar.gz", repos = NULL, type="source")

From GitHub (you will need to install devtools:

devtools::install_github(“yxlin/ggdmc”)

For Mac Users:

  1. You will need to install gfortran.

As to 27, Aug, 2018, please install gfortran 6.1, even you are using a mscOS High Sierra Version 10.13.4. gfortran 6.3 may not work.

  1. Then install clang4-r.

James Balamuta has created a convenient tool, clang4-r. Once you install clang4-r, your clang will then understand the OpenMP flag in ggdmc. You may use other methods he suggests to allow Mac to understand OpenMP flag, but they are usually less straightforward.

Please visit his site for detailed explanations about the clang-OpenMP issue.

Citation

If you use this package, please cite the software, for example:

Lin, Y.-S., & Heathcote, A. Population-based Monte Carlo is as Effective as Hamiltonian Monte Carlo in Fitting High Dimension Cognitive Model. Manuscript in preparation. Manuscript in preparation. Retrieved from https://github.com/yxlin/ggdmc

Lin, Y.-S., Heathcote, A., & Holmes, W. R (submitted). Parallel Probability Density Approximation. Behavior Research Methods.

Lin, Y.-S., Strickland, L., Reynold, A., & Heathcote, A. (manuscript in preparation). Tutorial on Bayesian cognitive modeling.

Contributors

The R documentation, tutorials, C++ codes, R helper functions and pacakging are developed by Yi-Shin Lin. DMC is developed by Andrew Heathcote (Heathcote et al., 2018). Please report bugs to me.

License

GPL-2

Acknowledgments

  • Early version of density.cpp is based on Voss & Voss's (2012) density.c in

fast-dm 30.2.

  • Early version of the truncated normal distributions is based on Jonathan

Olmsted's jpolmsted@gmail.com RcppTN 0.1-8 (https://github.com/olmjo/RcppTN) and Christopher Jackson's chris.jackson@mrc-bsu.cam.ac.uk R codes in msm package.

  • Armadillo is a collection of C++ library for performing linear

algebra http://arma.sourceforge.net/, authored by Conrad Sanderson.

  • Thanks to Matthew Gretton's consulation about rtdists.
  • Thanks to Andrew Heathcote for lending MacBook Air for facilitating tests of

ggdmc on OS X (mscOS High Sierra Version 10.13.4)

Copy Link

Version

Install

install.packages('ggdmc')

Monthly Downloads

437

Version

0.2.5.2

License

GPL-2

Issues

Pull Requests

Stars

Forks

Maintainer

Yi-Shin Lin

Last Published

September 1st, 2018

Functions in ggdmc (0.2.5.2)

TableParameters

Table response and parameter
CheckDMI

Check Data Model Instance
fac2df

Convert factor levels to a data frame
dgamma_l

A slightly modified dgamma and rgamma functions
ac

Calculate autocorrelation
n1PDF_cnorm

Likelihood function for correlated accumulator model
dconstant

A pseudo constant function to get constant densities
profile.model

Profile a model object
n1PDF_gpu

A Rcpp connection to the GPU-based LBA n1PDF in ppda package
g_minus

Calculate Drift-diffusion Probability Density
check_pvec

Check if parameter vector compatible with model object?
random

Generate random variates of various cognitive models
GetPNames

Extract parameter names from a model object
assign_pp

Slot pp values into p.prior
MakeForce

Create a vector for re-calculation
PickStuck

Which chains get stuck
GetTheta0

Extract Start Posterior Sample
StartNewsamples

Initialize New Samples
GetParameterMatrix

Return ns-npar matrix
autocor

Autocorrelation Plot
dlnorm_l

A slightly modified dlnorm and rlnorm functions
effectiveSize_hyper

Effective sample size
maker

Canonical Linear Ballistic Accumulation/Accumualtor Model
mcmc_list.model

Create a MCMC list
dtnorm

Truncated Normal Distribution
cellIdx2Mat

Convert cell index list to a matrix
checkddm2

Check parameter vector of DDM
plot_prior

Plot Prior Distributions
dbeta_lu

A slightly modified dbeta and rbeta functions
spdf

Simulated likelihood functions for RT models
MakeLevelArray

Create an array of all factorial combinations of factor levels
rlnr

Generate Random Choice RT Data from LNR Model
ismixed

Four pMCMC checking tools for automatic sampling
print.prior

Print Prior Distribution
dprior

Probability densities of prior distributions
PickChains

Draw n other chains and shuffle them
ismanymodels

Test whether `model` object has many models
censor

Censor missing values and RT outliers
run_many

Fit a Bayesian Model to multiple Participants
iseffective

Four pMCMC checking tools for automatic sampling
isflat

Four pMCMC checking tools for automatic sampling
rplba0

Piecewise LBA model
run

Run model fits
rprior_scalar

Parameter Prior Distributions
run_one

Fit a Bayesian Model to a Single Participant
sumloglike

Calculate Summed, Log-likelihood of a Cognitive Model
theta2mcmclist

Convert theta to a mcmc List
dcauchy_l

A slightly modified dcauchy and rcauchy functions
unstick_one

Unstick posterios samples (One subject)
ggdmc

Bayeisan Computation for Cognitive Models
hsumloglike

Add log-likelihoods across subjects at the hyper level
gelman

Potential scale reduction factor
get_os

Retrieve OS information
likelihood_norm

Calculate log likelihoods
isstuck

Four pMCMC checking tools for automatic sampling
n1PDF_plba0_gpu

A Rcpp connection to the GPU-based PLBA n1PDF type 0 and type 1
rca

Generate random number from a correlated accumulator model
remove_t0

Substract nondecision times from response times
score

Scoring RT data
simulate.model

Simulate RT Data
pairs.model

Correlation Matrix
summary.model

Summarise posterior samples
summary_mcmc_list

Summary statistic for posterior samples
CheckRJ

Check rejection rate
ConvertChains

Prepare posterior samples for plotting version 1
BuildModel

Create a model object
BuildDMI

Bind data and models
GetGamma

Generate a Gamma Vector
GetConstIdx

Whether a hyper-prior distribution is set constant
ConvertChains2

Prepare posterior samples for plotting version 2
BuildPrior

Specifying Parameter Prior Distributions
GetNsim

Get n-cell matrix