Function to allow a one stage joint model (data from all studies analysed in one model) to be fitted to data from multiple studies. The function allows one longitudinal and one time-to-event outcome, and can accommodate baseline hazard stratified or not stratified by study, as well as random effects at the individual level and the study level. Currently only zero mean random effects only proportional association supported - see Wulfsohn and Tsiatis 1997
jointmeta1(
data,
long.formula,
long.rand.ind,
long.rand.stud = NULL,
sharingstrct = c("randprop", "randsep", "value", "slope", "valandslope"),
surv.formula,
gpt,
lgpt,
max.it,
tol,
study.name,
strat = F,
longsep = F,
survsep = F,
bootrun = F,
print.detail = F
)an object of class jointdata containing the variables named in the model formulae
a formula object with the response varaible, and the covariates to include in the longitudinal sub-model
a vector of character strings to indicate what variables
to assign individual level random effects to. A maximum of three
individual level random effects can be assigned. To assign a random
intercept include 'int' in the vector. To not include an individual level
random intercept include 'noint' in the vector. For example to fit a model
with individual level random intercept and random slope set
long.rand.ind = c('int', 'time'), where 'time' is the
longitudinal time variable in the data.
a vector of character strings to indicate what
variables to assign study level random effects to. If no study level
random effects then this either not specified in function call or set to
NULL. If a study level random intercept is required, include the
name of the study membership variable for example long.rand.stud =
'study'.
currently must be set to 'randprop'. This gives a
model that shares the zero mean random effects (at both individual and
study level if specified) between the sub-models. Separate association
parameters are calculated for the linear combination of random effects at
each level. There are plans to expand to more sharing structures in the
future.
a formula object with the survival time, censoring
indicator and the covariates to include in the survival sub-model. The
response must be a survival object as returned by the
Surv function.
the number of quadrature points across which the integration with
respect to the random effects will be performed. If random effects are
specified at both the individual and the study level, the same number of
quadrature points is used in both cases. Defaults to gpt = 5.
the number of quadrature points which the log-likelihood is
evaluated over following a model fit. This defaults to lgpt = 7.
the maximum number of iterations of the EM algorithm that the
function will perform. Defaults to max.it = 350 although more
iterations could be required for large complex datasets.
the tolerance level used to determine convergence in the EM
algorithm. Defaults to tol = 0.001.
a character string denoting the name of the variable in the
baseline dataset in data holding study membership, for example
study.name = 'study'.
logical value: if TRUE then the survival sub-model is
calculated with a baseline stratified by study. Otherwise baseline is
unstratified
logical value: if TRUE then parameter estimates, model
fit and the log-likelihood from a separate linear mixed model analysis of
the longitudinal data are returned (see the lmer
function). The separate longitudinal model fit has the same specification
as the longitudinal sub-model of the joint model.
logical value: if TRUE then parameter estimates, model
fit and log-likelihood from a separate analysis of the survival data using
the Cox Proportional Hazards model are returned (see
coxph function for more details). This survival
fit has the same specification (apart from the association structure) as
the survival sub-model in the joint model.
logical value: if TRUE then the log-likelihood for the
model is not calculated. This option is available so that when
bootstrapping to obtain standard errors, as the log-likelihood is not
needed, it is not calculated, thus speeding up the bootstrapping process.
logical value: if TRUE then details of the
parameter estimates at each iteration of the EM algorithm are printed to
the console.
An object of class jointmeta1 See jointmeta1.object
The jointmeta1 function fits a one stage joint model
to survival and longitudinal data from multiple studies. This model is an
extension of the model proposed by Wulfsohn and Tsiatis (1997). The model
must contain at least one individual level random effect (specified using
the long.rand.ind argument). The model can also contain study level
random effects (specified using the long.rand.stud argument), which
can differ from the individual level random effects. The maximum number of
random effects that can be specified at each level is three. Note that the
fitting and bootstrapping time increases as the number of included random
effects increases. The model can also include a baseline hazard stratified
by study, or can utilise a common baseline across the studies in the
dataset. Interaction terms can be specified in either the longitudinal or
the survival sub-model.
The longitudinal sub-model is a mixed effects model. If both individual level and study level random effects are included in the function call, then the sub-model has the following format:
$$Y_{kij} = X_{1kij}\beta_{1} + Z^{(2)}_{1kij}b^{(2)}_{ki} + Z^{(3)}_{1kij}b^{(3)}_{k} + \epsilon_{kij}$$
Otherwise, if only individual level random effects are included in the function call, then the longitudinal sub-model has the following format:
$$Y_{kij} = X_{1kij}\beta_{1} + Z^{(2)}_{1kij}b^{(2)}_{ki} + \epsilon_{kij}$$
In the above equation, \(Y\) represents the longitudinal outcome and \(X_1\) represents the design matrix for the longitudinal fixed effects. The subscript 1 is used to distinguish between items from the longitudinal sub-model and items from the survival sub-model (which contain a subscript 2). The design matrices for random effects are represented using \(Z\), fixed effect coefficients are represented by \(\beta\), random effects by \(b\) and the measurement error by \(\epsilon\). Study membership is represented by the subscript \(k\) whilst individuals are identified by \(i\) and time points at which they are measured by \(j\). The longitudinal outcome is assumed continuous.
Currently this function only supports one linking structure between the sub-models, namely a random effects only proportional sharing structure. In this structure, the zero mean random effects from the longitudinal sub-model are inserted into the survival sub-model, with a common association parameter for each level of random effects. Therefore the survival sub-model (for a case without baseline stratified by study) takes the following format:
$$\lambda_{ki}(t) = \lambda_{0}(t)exp(X_{2ki}\beta_{2} + \alpha^{(2)}(Z^{(2)}_{1ki}b^{(2)}_{ki}) + \alpha^{(3)}(Z^{(3)}_{1ki}b^{(3)}_{k})) $$
Otherwise, if only individual level random effects are included in the function call, this reduces to:
$$\lambda_{ki}(t) = \lambda_{0}(t)exp(X_{2ki}\beta_{2} + \alpha^{(2)}(Z^{(2)}_{1ki}b^{(2)}_{ki}) $$
In the above equation, \(\lambda_{ki}(t)\) represents the survival time of the individual \(i\) in study \(k\), and \(\lambda_{0}(t)\) represents the baseline hazard. If a stratified baseline hazard were specified this would be replaced by \(\lambda_{0k}(t)\). The design matrix for the fixed effects in the survival sub-model is represented by \(X_{2ki}\), with fixed effect coefficients represented by \(\beta_{2}\). Association parameters quantifying the link between the sub-models are represented by \(\alpha\) terms.
The model is fitted using an EM algorithm, starting values for which are extracted from initial separate longitudinal and survival fits. Pseudo adaptive Gauss - Hermite quadrature is used to evaluate functions of the random effects in the EM algorithm, see Rizopoulos 2012.
Wulfsohn, M.S. and A.A. Tsiatis, A Joint Model for Survival and Longitudinal Data Measured with Error. 1997, International Biometric Society. p. 330
Rizopoulos, D. (2012) Fast fitting of joint models for longitudinal and event time data using a pseudo-adaptive Gaussian quadrature rule. Computational Statistics & Data Analysis 56 (3) p.491-501
# NOT RUN {
#change example data to jointdata object
jointdat2<-tojointdata(longitudinal = simdat2$longitudinal,
survival = simdat2$survival, id = 'id',longoutcome = 'Y',
timevarying = c('time','ltime'),
survtime = 'survtime', cens = 'cens',time = 'time')
#set variables to factors
jointdat2$baseline$study <- as.factor(jointdat2$baseline$study)
jointdat2$baseline$treat <- as.factor(jointdat2$baseline$treat)
#fit multi-study joint model
#note: for demonstration purposes only - max.it restricted to 5
#model would need more iterations to truely converge
onestagefit<-jointmeta1(data = jointdat2, long.formula = Y ~ 1 + time +
+ treat + study, long.rand.ind = c('int', 'time'),
long.rand.stud = c('treat'),
sharingstrct = 'randprop',
surv.formula = Surv(survtime, cens) ~ treat,
study.name = 'study', strat = TRUE, max.it=5)
# }
Run the code above in your browser using DataLab