xgrid.run.jags: Run a JAGS Model using an Xgrid distributed computing cluster from Within R

Description

Extends the functionality of the (auto)run.jags(file) family of functions for use with Apple Xgrid distributed computing clusters. Jobs can be run synchronously using xgrid.(auto)run.jags(file), in which case the process waits for the model to complete before returning the results, or asynchronously using xgrid.submit.jags(file), in which case the process terminates on submission of the job and the results are retrieved at a later time using xgrid.results.jags. The latter function can also be used to check the progress of incomplete simulations without stopping or retrieving the full job. Access to an Xgrid cluster with JAGS (although not necessarily R) installed is required. Because these functions depend on the Xgrid software to perform the underlying submission and retrieval of jobs, they can only be used on machines running Mac OS X. Further details of the required environment variables and the optional mgrid script to enable multi-task jobs can be found in the details section.

Usage

xgrid.run.jags(wait.interval="10 min", xgrid.method='simple',
jagspath='/usr/local/bin/jags', jobname=NA, cleanup=TRUE,
sub.app=if(!file.exists(Sys.which('mgrid'))) 
'xgrid -job submit -in "$indir"'
else 'mgrid -t $ntasks -i "$indir"', sub.options="", 
sub.command=paste(sub.app, sub.options, '"$cmd"', 
sep=' '), ...)
xgrid.run.jagsfile(wait.interval="10 min", xgrid.method='simple',
jagspath='/usr/local/bin/jags', jobname=NA, cleanup=TRUE,
sub.app=if(!file.exists(Sys.which('mgrid'))) 
'xgrid -job submit -in "$indir"'
else 'mgrid -t $ntasks -i "$indir"', sub.options="", 
sub.command=paste(sub.app, sub.options, '"$cmd"', 
sep=' '), ...)
xgrid.autorun.jags(wait.interval="10 min", xgrid.method='simple',
jagspath='/usr/local/bin/jags', jobname=NA, cleanup=TRUE,
sub.app=if(!file.exists(Sys.which('mgrid'))) 
'xgrid -job submit -in "$indir"'
else 'mgrid -t $ntasks -i "$indir"', sub.options="", 
sub.command=paste(sub.app, sub.options, '"$cmd"', 
sep=' '), ...)
xgrid.autorun.jagsfile(wait.interval="10 min", xgrid.method='simple',
jagspath='/usr/local/bin/jags', jobname=NA, cleanup=TRUE,
sub.app=if(!file.exists(Sys.which('mgrid'))) 
'xgrid -job submit -in "$indir"'
else 'mgrid -t $ntasks -i "$indir"', sub.options="", 
sub.command=paste(sub.app, sub.options, '"$cmd"', 
sep=' '), ...)
xgrid.submit.jags(xgrid.method='simple', jagspath='/usr/local/bin/jags',
jobname=NA, sub.app=if(!file.exists(Sys.which('mgrid'))) 
'xgrid -job submit -in "$indir"' else 'mgrid -t $ntasks -i "$indir"', 
sub.options="", sub.command=paste(sub.app, sub.options, '"$cmd"', 
sep=' '), ...)

xgrid.submit.jagsfile(xgrid.method='simple',
jagspath='/usr/local/bin/jags',
jobname=NA, sub.app=if(!file.exists(Sys.which('mgrid'))) 
'xgrid -job submit -in "$indir"' else 'mgrid -t $ntasks -i "$indir"', 
sub.options="", sub.command=paste(sub.app, sub.options, '"$cmd"', 
sep=' '), ...)
xgrid.results.jags(jobname, cleanup=TRUE, ...)

Arguments

wait.interval

when running xgrid jobs synchronously, the waiting time between retrieving the status of the job. If the job is found to be finished on retrieving the status then results are returned, otherwise the function waits for 'wait.interval' before repeating the

xgrid.method

the method of submitting the simulation to Xgrid - one of 'simple', 'separatejobs' or 'separatetasks'. The former runs all chains on a single node, whereas 'separatejobs' runs all chains as individual xgrid jobs and 'separatetasks' runs all chains as ind

method

the method with which to call JAGS; one of 'simple', 'interruptible' or 'parallel'. The former runs JAGS as a foreground process (the default behaviour for runjags < 0.9.6), 'interruptible' allows the JAGS process to be terminated immediately using the i

jagspath

the path to the JAGS executable on the xgrid machines. Note that /usr/local/bin is not included in the path when running Xgrid jobs, so it is safer to provide the full path. If not all machines on the xgrid cluster have JAGS installed then it is possible

jobname

for all functions except xgrid.results.jags, the jobname can be provided to make identification of the job using Xgrid Admin easier. If none is provided, then one is generated using a combination of the username and hostname of the submitting machine. I

cleanup

option to delete the job(s) from Xgrid after retrieving the results. Default TRUE.

sub.app

the submission application or script to use for job running/submission. The inbuilt Xgrid application supports most options, but greater functionality is provided by the mgrid script (see the details section for more information and installation instruct

sub.options

one or more option flags to be passed through to the submission application (as a character string). Examples include ART scripts, email on job completion, and when using the mgrid script many other possibilities (see the details section). When providin

sub.command

the actual command to be executed using system() to submit the job. Changing this results in sub.app and sub.options being ignored, and is probably the best option to use for custom submission scripts (see the sub.app argument for the requirements for cu

...

other options to be passed to the (auto)run.jags(file) functions as if the model were being run locally. The following options to be applied after running the simulation can be specified to xgrid.results.jags, and will be ignored for other functions: ke

Value

For xgrid.submit.jags and xgrid.submit.jagsfile, a list containing the jobname (which will be required by xgrid.results.jags to retrieve the job) and the job ID(s) for use with the xgrid command line facilities. For all other functions, the results of the simulation are returned as with the respective (auto)run.jags(file) functions.

Details

These functions allow JAGS models to be run on Xgrid distributed computing clusters from within R using the same syntax as required to run the models locally. All the functionality could be replicated by saving all necessary objects to files and using the Xgrid command line utility to submit and retrieve the job manually; these functions merely provide the convenience of not having to do this manually. Xgrid support is only available on Mac OS X machines.

The xgrid controller hostname and password must be set as environment variables. The command line version of R picks up environment variables set in the .profile file, but the GUI version does not, and requires them to be set from within R using:

Sys.setenv(XGRID_CONTROLLER_HOSTNAME="")

Sys.setenv(XGRID_CONTROLLER_PASSWORD="")

(These lines could be copied into your .Rprofile file for a 'set and forget' solution)
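
Before submitting a job it can be useful to confirm that both variables are visible to the current R session, particularly when using the GUI version of R. The following is a minimal sketch using only base R; the helper name xgrid_env_ready is hypothetical and is not part of runjags.

```r
# Hypothetical helper (not part of runjags): report whether the two
# environment variables named above are set to non-empty values.
xgrid_env_ready <- function() {
  vars <- c("XGRID_CONTROLLER_HOSTNAME", "XGRID_CONTROLLER_PASSWORD")
  # Sys.getenv() returns "" for variables that are not set
  is_set <- nzchar(Sys.getenv(vars))
  names(is_set) <- vars
  if (!all(is_set)) {
    warning("Not set: ", paste(vars[!is_set], collapse = ", "))
  }
  all(is_set)
}
```

Calling xgrid_env_ready() returns TRUE only when both variables are non-empty, and otherwise issues a warning naming the missing ones.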

All functions can be run using the built-in xgrid commands; however, some added functionality (including multi-task jobs to enable the 'separatetasks' method) is provided by the 'mgrid.sh' BASH shell script, which is included with the runjags package (in the 'inst/xgrid' folder for the package source or the 'xgrid' folder for the installed package). More details about this script are given at the top of the mgrid.sh file. To install (optional), see the install.mgrid function.
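
To see which submission application would be picked up on a given machine, the default expressions for sub.app and sub.command shown in the Usage section can be evaluated directly. This is only a sketch of how those defaults combine: the $indir, $ntasks and $cmd tokens are left as literal text here, on the assumption that they are placeholders filled in at submission time.

```r
# Reproduce the default sub.app expression from the Usage section:
# fall back to the built-in xgrid client when mgrid is not on the PATH.
# (Sys.which() returns "" for a command that cannot be found.)
sub.app <- if (!file.exists(Sys.which('mgrid'))) {
  'xgrid -job submit -in "$indir"'
} else {
  'mgrid -t $ntasks -i "$indir"'
}

# No extra option flags in this sketch
sub.options <- ""

# Default sub.command: application, options, then the quoted command token
sub.command <- paste(sub.app, sub.options, '"$cmd"', sep = ' ')
print(sub.command)
```

Supplying a custom sub.command replaces this assembled string entirely, which is why sub.app and sub.options are then ignored.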

Examples


# run a simple model on Xgrid using a single job:

# Ensure the required environmental variables are set:
Sys.setenv(XGRID_CONTROLLER_HOSTNAME="<hostname>")
Sys.setenv(XGRID_CONTROLLER_PASSWORD="<password>")

# Simulate the data
X <- 1:100
Y <- rnorm(length(X), 2*X + 10, 1)

# Model in the JAGS format
model <- "model {
for(i in 1 : N){
Y[i] ~ dnorm(true.y[i], precision);
true.y[i] <- (m * X[i]) + c;
}
m ~ dunif(-1000,1000);
c ~ dunif(-1000,1000);
precision ~ dexp(1);
}"

# Run the model synchronously using the 'simple' method 
# and a wait interval of 1 minute:
results <- xgrid.run.jags(xgrid.method='simple', 
	wait.interval='1 min', model=model, monitor=c("m", "c", 
	"precision"), data=list(N=length(X), X=X, Y=Y), n.chains=2, 
	plots = FALSE)

# Analyse the results:
results$summary


# Submit a job to xgrid and (later) retrieve the results.  Use an 
# ART script to ensure the job is only sent to nodes with JAGS installed:

# Ensure the required environmental variables are set:
Sys.setenv(XGRID_CONTROLLER_HOSTNAME="<hostname>")
Sys.setenv(XGRID_CONTROLLER_PASSWORD="<password>")

# Create the ART script we need to ensure JAGS is installed:
cat('#!/bin/bash
if [ -f /usr/local/bin/jags ]; then 
echo 1
else 
echo 0
fi
', file='jagsART.sh')

# Simulate the data
X <- 1:100
Y <- rnorm(length(X), 2*X + 10, 1)

# Model in the JAGS format
model <- "model {
for(i in 1 : N){
Y[i] ~ dnorm(true.y[i], precision);
true.y[i] <- (m * X[i]) + c;
}
m ~ dunif(-1000,1000);
c ~ dunif(-1000,1000);
precision ~ dexp(1);
}"

# Run the model asynchronously (the ART script must be 
# specified using an absolute path as xgrid won't be called 
# in the current working directory, and all paths must be 
# enclosed in quotes to preserve spaces):
name <- xgrid.submit.jags(xgrid.method='separatejobs',
	sub.options=if(!file.exists(Sys.which('mgrid'))) 
	paste('-art "', getwd(), '/jagsART.sh"', sep='') else 
	paste('-a "', getwd(), '/jagsART.sh"', sep=''), 
	model=model, monitor=c("m", "c", "precision"),
	data=list(N=length(X), X=X, Y=Y), n.chains=2, plots = FALSE,
	inits=list(list(.RNG.name='base::Wichmann-Hill'), 
	list(.RNG.name='base::Marsaglia-Multicarry')))

# Cleanup (remove jagsART file):
unlink('jagsART.sh')

# Retrieve the results:
results <- xgrid.results.jags(name)



# Autorun a model to convergence using separate tasks on xgrid.  
# Ensure the tasks are sent to the 2 fastest nodes (called 'Bugati' 
# and 'McLaren') in our (fictional) cluster using arguments to mgrid.

# Ensure the required environmental variables are set:
Sys.setenv(XGRID_CONTROLLER_HOSTNAME="<hostname>")
Sys.setenv(XGRID_CONTROLLER_PASSWORD="<password>")

# Ensure mgrid is installed:
if(!file.exists(Sys.which('mgrid'))) install.mgrid()

# Simulate the data
X <- 1:100
Y <- rnorm(length(X), 2*X + 10, 1)

# Model in the JAGS format
model <- "model {
for(i in 1 : N){
Y[i] ~ dnorm(true.y[i], precision);
true.y[i] <- (m * X[i]) + c;
}
m ~ dunif(-1000,1000);
c ~ dunif(-1000,1000);
precision ~ dexp(1);
}"

# Run the model synchronously using the 'separatetasks' method and 
# a wait interval of 1 minute:
results <- xgrid.autorun.jags(xgrid.method='separatetasks', 
	wait.interval='1 min', sub.options='-h "Bugati:McLaren"', 
	model=model, monitor=c("m", "c", "precision"), 
	data=list(N=length(X), X=X, Y=Y), n.chains=2, 
	inits=list(list(.RNG.name='base::Wichmann-Hill'), 
	list(.RNG.name='base::Marsaglia-Multicarry')), plots = FALSE)
