ctFitCovCheck: Visual lagged covariance or correlation diagnostics for ctsem fits.

Description

Compares empirical lagged covariances or correlations with the same quantity computed from data generated by the fitted model. Correlations are computed directly from paired observations within each generated data set, then summarised across generated data sets.
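
The computation described above can be sketched in base R. This is an illustration under stated assumptions, not the code used by `ctFitCovCheck`: `lagged_cor` and `sim_toy` are hypothetical helpers, the "generated" data sets are simulated AR(1) series rather than output from `ctGenerateFromFit`, and the columns `id`, `time` and `Y1` are assumed names for a long-format data set.

```r
# Hypothetical helper (not part of ctsem): correlation between x at
# observation t and y at observation t + lag, pooled over subjects.
lagged_cor <- function(dat, x, y, lag, minpairn = 10) {
  pairs <- lapply(split(dat, dat$id), function(d) {
    d <- d[order(d$time), ]
    n <- nrow(d)
    if (n <= lag) return(NULL)  # subject too short for this lag
    data.frame(a = d[[x]][seq_len(n - lag)],
               b = d[[y]][seq(lag + 1, n)])
  })
  pairs <- do.call(rbind, pairs)
  if (is.null(pairs)) return(NA_real_)
  pairs <- pairs[stats::complete.cases(pairs), ]
  # Skip the estimate when too few complete pairs are available.
  if (nrow(pairs) < minpairn) return(NA_real_)
  stats::cor(pairs$a, pairs$b)
}

# Hypothetical stand-in for one generated data set: 5 subjects, 20 waves.
sim_toy <- function() {
  d <- data.frame(id = rep(1:5, each = 20), time = rep(1:20, times = 5))
  d$Y1 <- unlist(lapply(1:5, function(i)
    as.numeric(stats::arima.sim(list(ar = 0.7), n = 20))))
  d
}

set.seed(1)
lags <- 0:3
# One column of lagged correlations per generated data set.
gen <- replicate(50, {
  d <- sim_toy()
  sapply(lags, function(l) lagged_cor(d, "Y1", "Y1", l))
})
# Summarise across generated data sets: one row per lag.
model_summary <- t(apply(gen, 1, stats::quantile,
                         probs = c(0.025, 0.5, 0.975)))
rownames(model_summary) <- paste0("lag", lags)
model_summary
```

An empirical value for the same lag would then be compared against the matching row of `model_summary`; the choice of a 95% interval here is an assumption made for the sketch.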

Usage

ctFitCovCheck(
  fit,
  cor = TRUE,
  plot = TRUE,
  splitby = NULL,
  data = NULL,
  lags = 0:10,
  variables = NULL,
  splitdata = data,
  split = NULL,
  breaks = 2,
  nsamples = NULL,
  minpairn = 10,
  cores = 1,
  keep = c("summary", "samples")
)

Value

If `plot = TRUE` (default), a plot or list of plots produced by `ctFitCovCheckPlot`; if `plot = FALSE`, a `data.table` of diagnostics.

Arguments

fit: A `ctStanFit` object.
cor: Logical; if `TRUE` correlations are analysed instead of covariances.
plot: Logical; if `TRUE` (default) a `ggplot2` object (or list of plots) is returned. If `FALSE`, the raw `data.table` with diagnostics is returned.
splitby: Optional character string giving the name of the variable on which to split subjects or observations. By default, numeric variables are averaged within subject and the subject means are then split at the median; non-numeric variables are treated as groups and may vary within subject.
data: Optional long-format data to use for the empirical lagged covariances or correlations. If omitted, the fitted data are used.
lags: Non-negative integer vector giving observation lags to compare.
variables: Optional character vector of manifest variables to include. Defaults to all manifest variables in the model.
splitdata: Optional data containing `splitby`, useful when the split variable was not in the fitted model. If omitted, `data` is used when supplied, otherwise the fitted data are used.
split: Split rule for `splitby`: `"median"`, `"mean"`, `"quantile"`, `"factor"`, or `"none"`. If `NULL`, numeric split variables use `"median"` and non-numeric split variables use `"factor"`.
breaks: Number of quantile groups when `split = "quantile"`.
nsamples: Optional number of generated data sets to use. Existing generated data are reused when present; otherwise `ctGenerateFromFit` is called.
minpairn: Minimum number of complete paired observations required for an empirical or generated lagged covariance/correlation.
cores: Number of cores to use for generating data and summarising generated samples.
keep: If `"samples"`, draw-level model diagnostics are attached as the `"samples"` attribute of the returned data.
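
The split rules listed under `split` and `breaks` can be pictured with a small base R sketch. `split_groups` is a hypothetical helper, not ctsem code; the group labels, the use of `<=` at the cut point, and the input vector `ti` (standing in for one value per subject) are all assumptions made for illustration.

```r
# Hypothetical helper (not part of ctsem): assign each value to a group.
split_groups <- function(x, split = NULL, breaks = 2) {
  # Mirror the documented default: median for numeric, factor otherwise.
  if (is.null(split)) split <- if (is.numeric(x)) "median" else "factor"
  switch(split,
    median = ifelse(x <= stats::median(x, na.rm = TRUE), "low", "high"),
    mean = ifelse(x <= mean(x, na.rm = TRUE), "low", "high"),
    # Quantile groups; cut() errors if the quantile breaks are not unique.
    quantile = as.character(cut(x,
      breaks = stats::quantile(x, probs = seq(0, 1, length.out = breaks + 1),
                               na.rm = TRUE),
      include.lowest = TRUE, labels = paste0("q", seq_len(breaks)))),
    factor = as.character(x),
    none = rep("all", length(x)),
    stop("unknown split rule: ", split)
  )
}

ti <- c(-1.2, 0.3, 0.8, 2.5, -0.4, 1.1)
split_groups(ti)                                  # default median split
split_groups(ti, split = "quantile", breaks = 3)  # three quantile groups
split_groups(c("a", "b", "a"))                    # non-numeric: grouped as is
```

With `split = "none"` every value falls into a single group, so the diagnostic would be computed on the full data.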

Examples


# \donttest{
# Lagged correlations for all manifest variables, lags 0 to 3.
ctFitCovCheck(ctstantestfit, cor = TRUE, lags = 0:3, plot = TRUE)

# Lagged covariances for one manifest variable.
ctFitCovCheck(ctstantestfit, cor = FALSE, variables = "Y1", lags = 0:5,
  plot = FALSE)

# Split the diagnostic by a time independent predictor in the fitted data.
gg <- ctFitCovCheck(ctstantestfit, cor = TRUE, splitby = "TI1", lags = 0:3)
print(gg[[1]])

# Use an external subject-level variable for the split.
splitdat <- unique(data.frame(ctstantestdat)[, "id", drop = FALSE])
splitdat$group <- rep(c("a", "b"), length.out = nrow(splitdat))
ctFitCovCheck(ctstantestfit, splitby = "group", splitdata = splitdat,
  lags = 0:2)

# Split observations into early and late periods within each subject.
dat <- data.table::as.data.table(data.frame(ctstantestdat))
dat[, period := ifelse(time <= stats::median(time, na.rm = TRUE),
  "early", "late"), by = id]
ctFitCovCheck(ctstantestfit, data = dat, splitby = "period",
  split = "factor", lags = 0:2)
# }
