run.jags: Run or extend a user-specified Bayesian MCMC model in JAGS from within R

Description

Runs or extends a user-specified JAGS model from within R, returning an object of class runjags-class.

Usage

run.jags(model, monitor = NA, data = NA, n.chains = NA, inits = NA,
  burnin = 4000, sample = 10000, adapt = 1000, noread.monitor = NULL,
  datalist = NA, initlist = NA, jags = runjags.getOption("jagspath"),
  silent.jags = runjags.getOption("silent.jags"),
  modules = runjags.getOption("modules"),
  factories = runjags.getOption("factories"), summarise = TRUE,
  mutate = NA, thin = 1, keep.jags.files = FALSE,
  tempdir = runjags.getOption("tempdir"), jags.refresh = 0.1,
  batch.jags = silent.jags, method = runjags.getOption("method"),
  method.options = list(), ...)
extend.jags(runjags.object, add.monitor = character(0),
  drop.monitor = character(0), drop.chain = numeric(0),
  combine = length(c(add.monitor, drop.monitor, drop.chain)) == 0,
  burnin = 0, sample = 10000, adapt = 1000, noread.monitor = NA,
  jags = NA, silent.jags = NA, summarise = sample >= 100, thin = NA,
  keep.jags.files = FALSE, tempdir = runjags.getOption("tempdir"),
  jags.refresh = NA, batch.jags = silent.jags, method = NA,
  method.options = NA, ...)

Arguments

model

either a relative or absolute path to a textfile (including the file extension) containing a model in the JAGS language and possibly monitored variable names, data and/or initial values, or a character string of the same. No default. See

monitor

a character vector of the names of variables to monitor. No default. The special node names 'deviance', 'pd', 'popt', 'dic', 'ped' and 'full.pd' are used to monitor the deviance, mean pD, mean pOpt, DIC, PED and full distribution of sum(pD) respectively

data

a named list, data frame, environment, character string in the R dump format (see dump.format), or a function (with no arguments) returning one of these types. If the model text contains inline #data#

n.chains

the number of chains to use with the simulation. More chains will improve the sensitivity of the convergence diagnostic, but will cause the simulation to run more slowly (although this may be improved by using a method such as 'parallel', 'rjparallel' or

inits

either a character vector with length equal to the number of chains the model will be run using, or a list of named lists representing names and corresponding values of inits for each chain, or a function with either 1 argument representing the chain or

burnin

the number of burnin iterations to use for the simulation, NOT including the adaptive iterations. Note that the defaults are 4000 burnin iterations plus 1000 adaptive iterations, for a total of 5000 iterations before sampling begins.
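
As a rough illustration of this arithmetic, the sketch below tallies the iterations each chain runs under the defaults shown in Usage. The variable names are illustrative only, and thin is assumed to be left at its default of 1, so that one iteration yields one sample.

```r
# Defaults taken from the run.jags usage signature
n_adapt  <- 1000   # adaptive iterations
n_burnin <- 4000   # burnin iterations (does not include n_adapt)
n_sample <- 10000  # samples retained per chain

# Iterations run before any samples are kept
pre_sampling <- n_adapt + n_burnin

# Total iterations per chain, assuming thin = 1
total_per_chain <- pre_sampling + n_sample

cat("Iterations before sampling:", pre_sampling, "\n")
cat("Total iterations per chain:", total_per_chain, "\n")
```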

sample

the total number of (additional) samples to take. Default 10000 iterations. If specified as 0, then the model will be created and returned without any MCMC samples (burnin and adapt will be ignored). Note that a minimum of 100 samples is required to ge

adapt

the number of adaptive iterations to use at the start of the simulation. If the adaptive phase is not long enough, the sampling efficiency of the MCMC chains will be compromised. If the model does not require adaptation (either because a compiled rjags

noread.monitor

an optional character vector of variables to monitor in JAGS and output to coda files, but that should not be read back into R. This may be useful (in conjunction with keep.jags.files=TRUE) for looking at large numbers of variables a few at a time using

datalist

deprecated argument.

initlist

deprecated argument.

jags

the system call or path for activating JAGS. Default uses the option given in runjags.options.

silent.jags

option to suppress output of the JAGS simulations. Default uses the option given in runjags.options.

modules

a character vector of external modules to be loaded into JAGS, either as the module name on its own or as the module name and status separated by a space, for example 'glm on'.

factories

a character vector of factory modules to be loaded into JAGS. Factories should be provided in the format '<facname> <factype> <status>' (where status is optional), for example: factories='mix::TemperedMix sampler on'. You must also ensure that any requi

summarise

should summary statistics be automatically calculated for the output chains? Default TRUE.

mutate

either a function or a list with first element a function and remaining elements arguments to this function. This can be used to add new variables to the posterior chains that are derived from the directly monitored variables in JAGS. This allows the var
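
A minimal sketch of a function that could serve as a mutate argument is given below. It assumes, without confirmation from this page, that the function receives a matrix-like object of monitored samples with named columns and returns a matrix of derived variables with column names. The helper name precision_to_sd and its var argument are hypothetical and not part of runjags.

```r
# Hypothetical helper (not part of runjags): derive a standard deviation
# column from a monitored precision column.
# Assumption: x is matrix-like, one row per sample, with named columns.
precision_to_sd <- function(x, var = "precision") {
  x <- as.matrix(x)
  sd_values <- 1 / sqrt(x[, var, drop = TRUE])
  out <- matrix(sd_values, ncol = 1)
  colnames(out) <- paste0(var, ".sd")
  out
}

# Stand-in for one chain of monitored samples
chain <- cbind(m = c(2.0, 2.1), precision = c(4, 0.25))
derived <- precision_to_sd(chain)
print(derived)
```

Under that assumption, the helper could be supplied either as mutate=precision_to_sd or, following the list form described above, as mutate=list(precision_to_sd, var="precision").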

thin

the thinning interval to be used in JAGS. Increasing the thinning interval may reduce autocorrelation, and therefore reduce the number of samples required, but will increase the time required to run the simulation. Using this option thinning is performe

keep.jags.files

option to keep the folder with files needed to call JAGS, rather than deleting it. This allows the simulation results to be re-read using results.jags(path-to-folder), even from another R session, and may also be useful for debugging models.

tempdir

option to use the temporary directory as specified by the system rather than creating files in the working directory. If keep.jags.files=TRUE then the folder is copied to the working directory after the job has finished (with a unique folder name based o

jags.refresh

the refresh interval (in seconds) for monitoring JAGS output using the 'interactive' and 'parallel' methods (see the 'method' argument). Longer refresh intervals will use slightly less processor time, but will make the simulation updates to be shown on t

batch.jags

option to call JAGS in batch mode, rather than using input redirection. On JAGS >= 3.0.0, this suppresses output of the status which may be useful in some situations. Default TRUE if silent.jags is TRUE, or FALSE otherwise.

method

the method with which to call JAGS; usually a character vector specifying one of 'rjags', 'simple', 'interruptible', 'parallel', 'rjparallel', 'background', 'bgparallel' or 'snow'. The 'rjags' and 'rjparallel' methods run JAGS using the rjags package, wh

method.options

a deprecated argument currently permitted for backwards compatibility, but this will be removed from a future version of runjags. Pass these arguments directly to run.jags or extend.jags.

...

summary parameters to be passed to add.summary, and/or additional options to control some methods including n.sims for parallel methods, cl for rjparallel and snow methods, remote.jags for snow, and by

runjags.object

the model to be extended - the output of a run.jags (or autorun.jags or extend.jags etc) function, with class 'runjags'. No default.

add.monitor

a character vector of variables to add to the monitored variable list. All previously monitored variables are automatically included - although see the 'drop.monitor' argument. Default no additional monitors.

drop.monitor

a character vector of previously monitored variables to remove from the monitored variable list for the extended model. Default none.

drop.chain

a numeric vector of chains to remove from the extended model. Default none.

combine

a logical flag indicating if results from the new JAGS run should be combined with the previous chains. Default TRUE if not adding or removing variables or chains, and FALSE otherwise.
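
The default for combine follows directly from the expression in the extend.jags usage signature. The sketch below reproduces that rule as a standalone function so it can be checked without running JAGS, then wraps a call to extend.jags in a function that is defined but not called. The names default_combine and extend_with_new_monitor are hypothetical, 'results' is assumed to be an existing runjags object, and 'true.y' is borrowed from the example model.

```r
# Standalone copy of the default rule for 'combine' from the usage signature
default_combine <- function(add.monitor = character(0),
                            drop.monitor = character(0),
                            drop.chain = numeric(0)) {
  length(c(add.monitor, drop.monitor, drop.chain)) == 0
}

default_combine()                        # nothing changed: results are combined
default_combine(add.monitor = "true.y")  # monitor added: results not combined
default_combine(drop.chain = 2)          # chain dropped: results not combined

# Hypothetical wrapper: requires runjags and JAGS only when actually called.
# 'results' is assumed to be the output of a previous run.jags() call.
extend_with_new_monitor <- function(results) {
  runjags::extend.jags(results, add.monitor = "true.y", sample = 5000)
}
```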

Value

Usually an object of class 'runjags', or an object of class 'runjagsbginfo' for background methods (see runjags-class).

Details

The run.jags function reads, compiles, and updates a JAGS model based on a model representation (plus data, monitors and initial values) input by the user. The model can be contained in an external text file, or a character vector within R. The extend.jags function takes an existing runjags-class object and extends the simulation. Running a JAGS model using these functions has three main advantages:

1) The method used to call or extend the simulation can be changed simply by using the method option. The methods most likely to be used are 'interruptible' and 'rjags', which use one simulation per model, or 'parallel', 'bgparallel' and 'rjparallel', which run a separate simulation for each chain to speed up the model run. For more details see the description of the 'method' argument above.

2) All information required to re-run the simulations is stored within the runjags-class object returned. This complete representation can be written to a text file using write.jagsfile, then modified as necessary and re-run using only the file path as input.

3) Summary statistics for the returned simulations are automatically calculated and displayed using associated S3 methods intended to facilitate checking model convergence and run length. Additional methods are available for plot functions, as well as conversion to and from MCMC and rjags objects. See the help file for runjags-class for more details.

Examples

runjags.options(new.windows=FALSE)
# run a model to calculate the intercept and slope of the expression
# y = m x + c, assuming normal observation errors for y:

# Simulate the data
X <- 1:100
Y <- rnorm(length(X), 2*X + 10, 1)

# Model in the JAGS format
model <- "model {
for(i in 1 : N){
	Y[i] ~ dnorm(true.y[i], precision);
	true.y[i] <- (m * X[i]) + c
}
m ~ dunif(-1000,1000)
c ~ dunif(-1000,1000)
precision ~ dexp(1)
}"

# Data and initial values in a named list format,
# with explicit control over the random number
# generator used for each chain (optional):
data <- list(X=X, Y=Y, N=length(X))
inits1 <- list(m=1, c=1, precision=1,
.RNG.name="base::Super-Duper", .RNG.seed=1)
inits2 <- list(m=0.1, c=10, precision=1,
.RNG.name="base::Wichmann-Hill", .RNG.seed=2)

# Run the model and produce plots
results <- run.jags(model=model, monitor=c("m", "c", "precision"),
data=data, n.chains=2, method="rjags", inits=list(inits1,inits2))

# Standard plots of the monitored variables:
plot(results)

# Look at the summary statistics:
print(results)

# Extract only the coefficient as an mcmc.list object:
library('coda')
coeff <- as.mcmc.list(results, vars="m")


# The same model but using embedded shortcuts to specify data, inits and monitors,
# and using parallel chains:

# Model in the JAGS format

model <- "model {
for(i in 1 : N){ #data# N
	Y[i] ~ dnorm(true.y[i], precision) #data# Y
	true.y[i] <- (m * X[i]) + c #data# X
}
m ~ dunif(-1000,1000) #inits# m
c ~ dunif(-1000,1000)
precision ~ dexp(1)
#monitor# m, c, precision
}"

# Simulate the data
X <- 1:100
Y <- rnorm(length(X), 2*X + 10, 1)
N <- length(X)

initfunction <- function(chain) return(switch(chain,
	"1"=list(m=-10), "2"=list(m=10)))

# Run the 2 chains in parallel (allowing the run.jags function
# to control the number of parallel chains). We also use a
# mutate function to convert the precision to standard deviation:
results <- run.jags(model, n.chains=2, inits=initfunction,
method="parallel", mutate=list("prec2sd", vars="precision"))

# View the results using the standard print method:
results

# Look at some plots of the intercept and slope on a 3x3 grid:
plot(results, c('trace','histogram','ecdf','crosscorr','key'),
vars=c("m","^c"),layout=c(3,3))

# Write the current model representation to file:
write.jagsfile(results, file='mymod.txt')
# And re-run the simulation from this point:
newresults <- run.jags('mymod.txt')

# Run the same model using 8 chains in parallel.
# A list of 8 randomly generated starting values for m:
initlist <- replicate(8,list(m=runif(1,-20,20)),simplify=FALSE)

# Run the chains in parallel using JAGS (2 models
# with 4 chains each):
results <- run.jags(model, n.chains=8, inits=initlist,
method="parallel", n.sims=2)

# Set up a distributed computing cluster with 4 nodes:
library(parallel)
cl <- makeCluster(4)

# Run the chains in parallel rjags models (4 models
# with 2 chains each) on this cluster:
results <- run.jags(model, n.chains=8, inits=initlist,
method="rjparallel", cl=cl)

stopCluster(cl)

# For more examples see the quick-start vignette:
vignette('quickjags', package='runjags')

# And for more details about possible methods see:
vignette('userguide', package='runjags')
