Learn R Programming

⚠️There's a newer version (0.4.5) of this package.Take me there.

gmvjoint

What is gmvjoint?

  • "generalised"
  • multivariate
  • Joint models.

gmvjoint allows the user to fit joint models of survival and multivariate longitudinal data, where the longitudinal sub-models are specified by generalised linear mixed models (GLMMs). The joint models are fit via maximum likelihood using an approximate EM algorithm first proposed by Bernhardt et al. (2015). The GLMMs are specified using the same syntax as for package glmmTMB (Brooks et al., 2017). The joint models themselves are then the flexible extensions to those in e.g. Wulfsohn and Tsiatis (1997). The user is able to simulate data under many different response types.

Currently, six families can be fit: Gaussian; Poisson; binomial; Gamma; negative binomial; and generalised Poisson.

Installation

You can install the latest 'official' release from CRAN in the usual way:

install.packages('gmvjoint')

or the latest development version using devtools:

devtools::install_github('jamesmurray7/gmvjoint')

MacOS users may be interested in swapping their BLAS library to one which provides an optimal BLAS implementation for Mac hardware (vecLib).

Example

To fit a joint model, we first need to specify the longitudinal and survival sub-models.

The longitudinal sub-model must be a list which contains the specification of the longitudinal process along with its random effects structure in the same syntax as a glmmTMB model (which itself is the same as the widely-used lme4). As an example, suppose we want to fit a trivariate model on the oft-used PBC data, with a linear time-drug interaction term on albumin, a spline term on (logged) serum bilirubin and a linear fit on spiders, we specify

data(PBC)
PBC <- subset(PBC, select = c('id', 'survtime', 'status', 'drug', 'time',
                              'serBilir', 'albumin', 'spiders'))
PBC <- na.omit(PBC) 
long.formulas <- list(
  albumin ~ drug * time + (1 + time|id),
  log(serBilir) ~ drug * splines::ns(time, 3) + (1 + splines::ns(time, 3)|id),
  spiders ~ drug * time + (1|id)
)

where we note interactions and spline-time fits are possible.

The survival sub-model must be set-up using Surv() from the survival package e.g.

surv.formula <- Surv(survtime, status) ~ drug

Currently interaction terms in the survival sub-model specification are unsupported.

Now we can do the joint model call through the main workhorse function joint. This notably take a list of family arguments which must match-up in the desired order as the longitudinal process list. We then fit our joint model via

fit <- joint(long.formulas = long.formulas, surv.formula = surv.formula, data = PBC, 
             family = list("gaussian", "gaussian", "binomial"))
summary(fit)

where extra control arguments are documented in ?joint. For certain families, we could additionally supply disp.formulas which specify the dispersion model for the corresponding longitudinal process. Numerous S3 methods exist for the class of object joint creates: summary(), logLik(), fixef(), ranef(), fitted(), resid(), and vcov(). LaTeX-ready tables can also be generated by S3 method xtable(). Data can be simulated under a host of different parameter set-ups using the simData() function.

We bridge from a set of joint model parameter estimates to a prognostic one by dynamic predictions dynPred. We can assess discriminatory capabilities of the joint() model fit by the ROC function, too.

To-do list

Currently the largest limitation exists with the relatively strict data structure necessary and the corresponding calls to the joint function. The below lists these (known) limitations and plans for relaxing.

  • Longitudinal information: The longitudinal time argument must be named time and the subject identifier (which we 'split' random effects by) id. Unsure if I will ever change these; I think a little more user pre-processing is no bad thing, when alternative would be a more crowded call to joint, which I wouldn't be a fan of.
  • Misc.: data must be balanced (i.e. no NA values); this will be fixed in a future update. For now I don't think this is the biggest issue, and recommend using na.omit for example. Additionally, the id variable must increment by no more than one. That is, data$id=1,1,1,2,2,2,3,3,3 is fine, but data$id=1,1,1,1,3,3,3,4,4 is not. This is due to how data matrices are created internally and will be fixed in the future.

Update June 2023: Since calls and data are passed straight to glmmTB, possible to assign these unique indices 1:n, nothing currently done with this, though.

  • Pretty-ing of progress bars using cli package (maybe...).

Note I'm a PhD student, and the S3 methods (and some functions themselves) have largely arisen out of things I needed, or thought would be a good idea at some point!

References

Bernhardt PW, Zhang D and Wang HJ. A fast EM Algorithm for Fitting Joint Models of a Binary Response to Multiple Longitudinal Covariates Subject to Detection Limits. Computational Statistics and Data Analysis 2015; 85; 37--53

Mollie E. Brooks, Kasper Kristensen, Koen J. van Benthem, Arni Magnusson, Casper W. Berg, Anders Nielsen, Hans J. Skaug, Martin Maechler and Benjamin M. Bolker (2017). glmmTMB Balances Speed and Flexibility Among Packages for Zero-inflated Generalized Linear Mixed Modeling. The R Journal, 9(2), 378-400.

Murray, J and Philipson P. A fast approximate EM algorithm for joint models of survival and multivariate longitudinal data. Computational Statistics and Data Analysis 2022

Wulfsohn MS, Tsiatis AA. A joint model for survival and longitudinal data measured with error. Biometrics. 1997; 53(1), 330-339.

Copy Link

Version

Install

install.packages('gmvjoint')

Monthly Downloads

199

Version

0.4.0

License

GPL-3

Issues

Pull Requests

Stars

Forks

Maintainer

James Murray

Last Published

December 21st, 2023

Functions in gmvjoint (0.4.0)

joint

Fit a joint model to time-to-event and multivariate longitudinal data
plot.cond.b.joint

Plot posterior distribution of the random effects for a joint model.
logLik.joint

Log-likelihood for joint model.
ranef.joint

Extract random effects from a joint object.
parseCoxph

Parsing the survival formula and constructing all survival-related data objects.
plot.residuals.joint

Plot joint model residuals
residuals.joint

Obtain joint model residuals
plot.ROC.joint

Plot receiver operator characteristics.
joint.object

Fitted joint object
plot.dynPred

Plot conditional survival probabilities.
xtable.joint

Print an LaTeX-ready xtable for a joint object.
vcov.joint

Extract the variance-covariance matrix from a joint fit.
summary.joint

Summary of an joint object.
rgenpois

Simulate realisations from a generalised poisson distribution
simData

Simulate data from a multivariate joint model
ROC

Receiver Operator Characteristics (ROC) for a joint model.
fitted.joint

Obtain joint model fitted values
dynPred

Dynamic predictions for survival sub-model in a multivariate joint model.
PBC

Primary biliary cirrhosis data
boot.joint

Bootstrapping a joint object
extractAIC.joint

Extract AIC from a joint model fit.
cond.ranefs

Obtain conditional distribution of the random effects
fixef.joint

Extract fixed effects from a joint object.
anova.joint

Anova for joint models
gmvjoint

tools:::Rd_package_title("gmvjoint")