ctStanFit: ctStanFit

Description

Fits a ctsem model specified via ctModel with type either 'stanct' or 'standt', using the Bayesian inference software Stan.

Usage

ctStanFit(datalong, ctstanmodel, stanmodeltext = NA, iter = 1000,
  intoverstates = TRUE, binomial = FALSE, fit = TRUE, ukfpop = FALSE,
  stationary = FALSE, plot = FALSE, derrind = "all", optimize = FALSE,
  isloops = 10, isloopsize = 500, issamples = 5000, nopriors = FALSE,
  vb = FALSE, chains = 1, cores = "maxneeded", inits = NULL,
  maxtimestep = 9999, lineardynamics = "auto", forcerecompile = FALSE,
  control = list(adapt_delta = 0.8, adapt_init_buffer = 2, adapt_window = 2,
  max_treedepth = 10, stepsize = 0.001), verbose = 0, ...)

Arguments

datalong

long format data containing columns for subject id (numeric values, 1 to max subjects), manifest variables, any time dependent (i.e. varying within subject) predictors, and any time independent (not varying within subject) predictors.
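
As a rough illustration of this layout, the base R sketch below builds a small long-format data frame. The column names (id, time, Y1, Y2, TD1, TI1) and the simulated values are assumptions made for illustration only; in practice the names must match those declared in the model, and the time column is assumed here rather than stated in the description above.

```r
# Hypothetical long-format data: 3 subjects, 4 observations each.
set.seed(1)
n_subjects <- 3
n_obs <- 4

datalong_sketch <- data.frame(
  id   = rep(seq_len(n_subjects), each = n_obs),  # numeric ids, 1 to max subjects
  time = rep(seq(0, by = 0.5, length.out = n_obs), times = n_subjects),  # assumed time column
  Y1   = rnorm(n_subjects * n_obs),               # manifest variable 1
  Y2   = rnorm(n_subjects * n_obs),               # manifest variable 2
  TD1  = rbinom(n_subjects * n_obs, 1, 0.25)      # time dependent predictor (varies within subject)
)

# Time independent predictor: one value per subject, repeated on every row.
ti_values <- rnorm(n_subjects)
datalong_sketch$TI1 <- ti_values[datalong_sketch$id]

head(datalong_sketch)
```

Each row is one observation of one subject; the time independent predictor is constant within a subject, while the time dependent predictor may change from row to row.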

ctstanmodel

model object as generated by ctModel with type='stanct' or 'standt', for continuous or discrete time models respectively.

stanmodeltext

already specified Stan model character string. Generally leave as NA unless modifying the Stan model directly (possible after modifying the output from fit=FALSE).

iter

number of iterations, half of which will be devoted to warmup by default when sampling. When optimizing, this is the maximum number of iterations to allow -- convergence hopefully occurs before this!

intoverstates

logical indicating whether or not to integrate over latent states using a Kalman filter. Generally recommended to set TRUE unless using a non-Gaussian measurement model.

binomial

Deprecated. Logical indicating the use of binary rather than Gaussian data, as with IRT analyses. This now sets intoverstates = FALSE and the manifesttype of every indicator to 1, for binary.

fit

If TRUE, fit the specified model using Stan; if FALSE, return the Stan model object without fitting.

ukfpop

if TRUE, uses an unscented approximation for population distributions rather than full sampling. Allows for optimization of non-linearities and random effects.

stationary

Logical. If TRUE, T0VAR and T0MEANS input matrices are ignored, the parameters are instead fixed to long run expectations. More control over this can be achieved by instead setting parameter names of T0MEANS and T0VAR matrices in the input model to 'stationary', for elements that should be fixed to stationarity.

plot

if TRUE, a Shiny program is launched upon fitting to interactively plot samples. May struggle with many (e.g., > 5000) parameters, and may leave sample files in working directory if sampling is terminated.

derrind

vector of integers denoting which latent variables are involved in dynamic error calculations. Latents involved only in deterministic trends or input effects (i.e., those that receive no additional stochastic inputs after the first observation) can be removed from the matrices, speeding up calculations. If unsure, leave the default of 'all'. Ignored if intoverstates=FALSE.

optimize

if TRUE, use Stan's optimizer for maximum a posteriori estimates.

isloops

Only relevant if optimize=TRUE. Number of iterations of adaptive importance sampling to perform after optimization.

isloopsize

Only relevant if optimize=TRUE. Number of samples per iteration of importance sampling.

issamples

Number of samples to use for final results of importance sampling.

nopriors

logical. If TRUE, any priors are disabled -- sometimes desirable for optimization.

vb

if TRUE, use Stan's variational approximation. Rudimentary testing suggests it is not accurate for many ctsem models at present.

chains

number of chains to sample, during HMC or post-optimization importance sampling.

cores

number of CPU cores to use. Either 'maxneeded' to use as many as available, up to the number of chains, or a positive integer.
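
The rule described for 'maxneeded' can be sketched in base R as follows. The helper resolve_cores is hypothetical and not part of ctsem; it only mirrors the wording above (as many cores as available, capped at the number of chains) and may differ from what the package does internally.

```r
# Hypothetical helper mirroring the documented rule; not part of ctsem.
resolve_cores <- function(cores, chains, available = parallel::detectCores()) {
  if (identical(cores, "maxneeded")) {
    if (is.na(available)) available <- 1L  # detectCores() can return NA
    return(min(available, chains))
  }
  stopifnot(is.numeric(cores), length(cores) == 1, cores >= 1)
  as.integer(cores)
}

resolve_cores("maxneeded", chains = 2, available = 8)  # 2: capped by chains
resolve_cores("maxneeded", chains = 4, available = 2)  # 2: capped by available cores
resolve_cores(3, chains = 4)                           # 3: explicit request
```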

inits

vector of parameter start values, as returned by the rstan function unconstrain_pars for instance.

maxtimestep

positive numeric, only used for models with non-linear dynamics, specifying the largest time span covered by the Runge-Kutta 4 integration. The large default ensures that for each observation time interval, only RK4 integration is used. When maxtimestep is smaller than the observation time interval, RK4 integration is used within an Euler loop. Smaller values may offer greater accuracy, but are slower and often unnecessary. In case of initial value problems, reducing this is one thing to try.
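
To make the role of maxtimestep more concrete, here is a minimal base R sketch of sub-stepped RK4 integration. It is one reading of the description above, not ctsem's implementation: the helpers rk4_step and integrate_interval, the rule for choosing the number of sub-steps, and the toy logistic drift are all assumptions for illustration.

```r
# One classical Runge-Kutta 4 step for dx/dt = f(x) over a step of size h.
rk4_step <- function(f, x, h) {
  k1 <- f(x)
  k2 <- f(x + h / 2 * k1)
  k3 <- f(x + h / 2 * k2)
  k4 <- f(x + h * k3)
  x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
}

# Integrate across one observation interval dt, splitting it into equal
# sub-steps no longer than maxtimestep (hypothetical helper).
integrate_interval <- function(f, x, dt, maxtimestep = 9999) {
  nsteps <- max(1, ceiling(dt / maxtimestep))
  h <- dt / nsteps
  for (i in seq_len(nsteps)) x <- rk4_step(f, x, h)
  x
}

# Toy non-linear drift (assumed for illustration): logistic growth,
# which has a closed-form solution to compare against.
drift <- function(x) x * (1 - x)
exact <- function(x0, t) x0 * exp(t) / (1 + x0 * (exp(t) - 1))

single     <- integrate_interval(drift, 0.1, dt = 2)                     # one RK4 step
substepped <- integrate_interval(drift, 0.1, dt = 2, maxtimestep = 0.1)  # 20 RK4 sub-steps

# Absolute errors; the sub-stepped result should be closer to the exact value.
abs(c(single = single, substepped = substepped) - exact(0.1, 2))
```

With the large default, the whole interval is covered by a single RK4 step; a smaller maxtimestep trades computation time for accuracy, which matches the guidance that smaller values are slower and often unnecessary.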

lineardynamics

either character string "auto" or a logical. Set to TRUE to force linear dynamics, FALSE to use non-linear integration. "auto" attempts to select the appropriate choice.

forcerecompile

logical. For development purposes. If TRUE, the Stan model is recompiled, regardless of apparent need for compilation.

control

List of arguments passed to the control argument of stan, governing warmup / sampling behaviour.

verbose

Integer from 0 to 2. Higher values print more information during model fit -- for debugging.

...

additional arguments to pass to the stan function.

Examples

# NOT RUN {
# test data with 2 manifest indicators, each measuring 1 latent process,
# 1 time dependent predictor, and 3 time independent predictors
head(ctstantestdat)

# generate a ctStanModel
model <- ctModel(type = 'stanct',
  n.latent = 2, latentNames = c('eta1', 'eta2'),
  n.manifest = 2, manifestNames = c('Y1', 'Y2'),
  n.TDpred = 1, TDpredNames = 'TD1',
  n.TIpred = 3, TIpredNames = c('TI1', 'TI2', 'TI3'),
  LAMBDA = diag(2))

# set all parameters except manifest means to be fixed across subjects
model$pars$indvarying[-c(19, 20)] <- FALSE

# fit model to data (takes a few minutes, but iterations and
# max_treedepth are too low for inference!)
fit <- ctStanFit(ctstantestdat, model, iter = 200, chains = 2,
  control = list(max_treedepth = 6))

# output functions
summary(fit)

plot(fit)

# }