ctStanFitTV: ctStanFitTV

Description

Fits a ctsem model specified via ctModel with type either 'stanct' or 'standt', using the Bayesian inference software Stan, in time-varying parameter form. Experimental function!

Usage

ctStanFitTV(datalong, ctstanmodel, stanmodeltext = NA, iter = 2000,
  kalman = TRUE, binomial = FALSE, esthyper = TRUE, fit = TRUE,
  stationary = FALSE, plot = FALSE, diffusionindices = "all",
  asymdiffusion = FALSE, optimize = FALSE, vb = FALSE, chains = 1,
  cores = "maxneeded", inits = NULL, initwithoptim = FALSE,
  control = list(adapt_delta = 0.9, adapt_init_buffer = 10, adapt_window = 5,
  max_treedepth = 10, stepsize = 0.001), verbose = FALSE, ...)

Arguments

datalong

long format data containing columns for subject id (numeric values, 1 to max subjects), manifest variables, any time dependent (i.e. varying within subject) predictors, and any time independent (not varying within subject) predictors.
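As a rough illustration of the long format described above, the following base R sketch builds a tiny data frame with one row per subject per observation. The column names ('id', 'time', 'Y1', 'Y2', 'TD1', 'TI1') and all values are assumptions chosen to match the example model further down this page; in particular, the 'time' column is assumed here and is not named in the argument description.

```r
# Hypothetical long-format data: 2 subjects, 3 observations each.
# All column names and values are illustrative assumptions.
datalong_sketch <- data.frame(
  id   = rep(1:2, each = 3),            # subject id, 1 to max subjects
  time = rep(c(0, 1.5, 2), times = 2),  # observation times (assumed column)
  Y1   = c(0.2, 0.5, 0.1, 1.1, 0.9, 1.4),  # manifest variable
  Y2   = c(1.0, 0.8, 1.2, 0.3, 0.4, 0.2),  # manifest variable
  TD1  = c(0, 1, 0, 0, 0, 1),           # time dependent: varies within subject
  TI1  = rep(c(-0.5, 0.7), each = 3)    # time independent: constant within subject
)

# Check that the time independent predictor is constant within each subject.
ti_constant <- all(tapply(datalong_sketch$TI1, datalong_sketch$id,
                          function(x) length(unique(x)) == 1))
```

A check like `ti_constant` can catch a predictor that was meant to be time independent but was accidentally merged with within-subject variation.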

ctstanmodel

model object as generated by ctModel with type='stanct' or 'standt', for continuous or discrete time models respectively.

stanmodeltext

already specified Stan model character string, generally leave NA unless modifying Stan model directly. (Possible after modification of output from fit=FALSE)

iter

number of iterations, half of which will be devoted to warmup by default when sampling. When optimizing, this is the maximum number of iterations to allow -- convergence hopefully occurs before this!
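To make the warmup split concrete, here is a small arithmetic sketch of how many post-warmup draws a sampling run would retain. It assumes warmup is floor(iter / 2) and that no thinning is applied; the values of iter and chains are simply those used in the example on this page.

```r
# Illustrative arithmetic only; assumes warmup = floor(iter / 2), no thinning.
iter   <- 200
chains <- 2

warmup          <- floor(iter / 2)        # iterations discarded as warmup
draws_per_chain <- iter - warmup          # retained draws in each chain
total_draws     <- chains * draws_per_chain
```

With these values only 200 draws are retained in total, which is one reason the example notes that its settings are insufficient for inference.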

kalman

logical indicating whether or not to integrate over latent states using a Kalman filter. Generally recommended to set TRUE unless using a non-Gaussian measurement model. If not using the Kalman filter, experience suggests that some models / datasets require a relatively large number of very fast iterations before the sampler reaches the high density region. This can make it difficult to determine the number of iterations needed a priori; in such cases setting initwithoptim=TRUE may be helpful.

binomial

logical indicating the use of binomial rather than Gaussian data, as with IRT analyses.

esthyper

Logical indicating whether to explicitly estimate distributions for any individually varying parameters, or to fix the distributions to maximum likelihood estimates conditional on subject parameters.

fit

If TRUE, fit the specified model using Stan; if FALSE, return the Stan model object without fitting.

stationary

Logical. If TRUE, the T0VAR and T0MEANS input matrices are ignored and the parameters are instead fixed to long run expectations.

plot

if TRUE, a Shiny program is launched upon fitting to interactively plot samples. May struggle with many (e.g., > 5000) parameters, and may leave sample files in working directory if sampling is terminated.

diffusionindices

vector of integers denoting which latent variables are involved in covariance calculations. Latents involved only in deterministic trends or input effects can be removed from matrices, speeding up calculations. If unsure, leave the default of 'all'. Ignored if kalman=FALSE.

asymdiffusion

if TRUE, increases fitting speed at the cost of model flexibility: T0VAR in the model specification is ignored, and the DIFFUSION matrix specification is used as the asymptotic DIFFUSION matrix (Q*_inf in the vignette / paper), making it difficult if not impossible to properly specify higher order processes. The speed increases come about because the internal Kalman filter routine has many steps removed, and the asymptotic diffusion parameters are less dependent on the DRIFT matrix.

optimize

if TRUE, use Stan's optimizer for maximum a posteriori estimates. Setting this also sets esthyper=FALSE.

vb

if TRUE, use Stan's variational approximation. Rudimentary testing suggests it is not accurate for many ctsem models at present.

chains

number of chains to sample.

cores

number of cpu cores to use. Either 'maxneeded' to use as many as available, up to the number of chains, or an integer.
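The 'maxneeded' rule can be read as "the smaller of the number of chains and the number of available cores". The sketch below expresses that reading in base R; `resolve_cores` is a hypothetical helper written for illustration and is not a ctsem function, and the package's internal logic may differ.

```r
library(parallel)

# Hypothetical helper (not part of ctsem): resolve the 'cores' argument
# to an integer, following the description of 'maxneeded' above.
resolve_cores <- function(cores, chains) {
  if (identical(cores, "maxneeded")) {
    available <- detectCores()
    if (is.na(available)) available <- 1L  # detectCores() can return NA
    return(min(chains, available))
  }
  as.integer(cores)
}

resolve_cores("maxneeded", chains = 1)  # a single chain never needs more than 1 core
resolve_cores(2, chains = 4)            # an explicit integer is used as given
```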

inits

vector of parameter start values, as returned by the rstan function unconstrain_pars for instance.

initwithoptim

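Logical. If TRUE, the model, with population standard deviations fixed to 1 (so approx 65% of the population mean prior), is first fit with penalised maximum likelihood to determine starting values for the chains. This can help speed convergence and avoid problematic regions for certain problems.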

control

List of arguments sent to stan control argument, regarding warmup / sampling behaviour.
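In R, supplying an argument replaces its default value as a whole, so passing control = list(max_treedepth = 6) does not by itself retain the other defaults shown in the Usage section (unless the function merges them internally, which this page does not state). A minimal sketch of changing one setting while keeping the rest, using base R's modifyList, might look like this; the variable names are illustrative.

```r
# Defaults copied from the Usage section of this page.
default_control <- list(adapt_delta = 0.9, adapt_init_buffer = 10,
                        adapt_window = 5, max_treedepth = 10,
                        stepsize = 0.001)

# Override a single entry; all other entries are kept unchanged.
my_control <- modifyList(default_control, list(max_treedepth = 6))
```

The resulting `my_control` list could then be passed as the control argument.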

verbose

Logical. If TRUE, prints log probability at each iteration.

...

additional arguments to pass to stan function.

Examples

# NOT RUN {
#test data with 2 manifest indicators measuring 1 latent process each, 
# 1 time dependent predictor, 3 time independent predictors
head(ctstantestdat) 

#generate a ctStanModel
model <- ctModel(type = 'stanct',
  n.latent = 2, latentNames = c('eta1', 'eta2'),
  n.manifest = 2, manifestNames = c('Y1', 'Y2'),
  n.TDpred = 1, TDpredNames = 'TD1',
  n.TIpred = 3, TIpredNames = c('TI1', 'TI2', 'TI3'),
  LAMBDA = diag(2))

#set all parameters except manifest means to be fixed across subjects
model$pars$indvarying[-c(19,20)] <- FALSE

#fit model to data (takes a few minutes - but insufficient 
# iterations and max_treedepth for inference!)
fit <- ctStanFit(ctstantestdat, model, iter = 200, chains = 2,
  control = list(max_treedepth = 6))

#output functions
summary(fit) 

plot(fit)

# }
