Family object to fit a flexible additive joint model for longitudinal and survival data under a Bayesian approach as presented in Koehler et al. (2017a, b). All parts of the joint model can be specified as structured additive predictors. See the details and examples.

```
## JM family object.
jm_bamlss(...)
```## "bamlss.frame" transformer function
## to set up joint models.
jm_transform(x, y, data, terms, knots, formula, family, subdivisions = 25,
timedependent = c("lambda", "mu", "alpha", "dalpha"), timevar = NULL,
idvar = NULL, alpha = .Machine$double.eps, mu = NULL, sigma = NULL,
sparse = TRUE, nonlinear = FALSE, edf_alt = FALSE, start_mu = NULL,
k_mu = 6, ...)
## Posterior mode optimizing engine.
opt_JM(x, y, start = NULL, weights = NULL, offset = NULL,
criterion = c("AICc", "BIC", "AIC"), maxit = c(100, 1),
nu = c("lambda" = 0.1, "gamma" = 0.1, "mu" = 0.1, "sigma" = 0.1,
"alpha" = 0.1, "dalpha" = 0.1),
update.nu = FALSE, eps = 0.0001, alpha.eps = 0.001, ic.eps = 1e-08,
nback = 40, verbose = TRUE, digits = 4, ...)

jm_mode(x, y, start = NULL, weights = NULL, offset = NULL,
criterion = c("AICc", "BIC", "AIC"), maxit = c(100, 1),
nu = c("lambda" = 0.1, "gamma" = 0.1, "mu" = 0.1, "sigma" = 0.1,
"alpha" = 0.1, "dalpha" = 0.1),
update.nu = FALSE, eps = 0.0001, alpha.eps = 0.001, ic.eps = 1e-08,
nback = 40, verbose = TRUE, digits = 4, ...)

## Sampler function.
sam_JM(x, y, family, start = NULL, weights = NULL, offset = NULL,
n.iter = 1200, burnin = 200, thin = 1, verbose = TRUE, digits = 4,
step = 20, ...)

jm_mcmc(x, y, family, start = NULL, weights = NULL, offset = NULL,
n.iter = 1200, burnin = 200, thin = 1, verbose = TRUE, digits = 4,
step = 20, ...)
## Predict function, set to default in jm_bamlss().
jm_predict(object, newdata,
type = c("link", "parameter", "probabilities", "cumhaz", "loglik"),
dt, steps, id, FUN = function(x) { mean(x, na.rm = TRUE) },
subdivisions = 100, cores = NULL, chunks = 1,
verbose = FALSE, ...)
## Survival plot.
jm_survplot(object, id = 1, dt = NULL, steps = 10,
points = TRUE, rug = !points)

x

The `x`

list, as returned from function
`bamlss.frame`

(and transformed by function `jm_transform()`

),
holding all model matrices and other information that is used for
fitting the model.

y

The model response, as returned from function `bamlss.frame`

.

data

A `data.frame`

or `list`

containing the model
response variable(s) and covariates specified in the `formula`

in long format.
By default the variables are taken from `environment(formula)`

:
typically the environment from which `bamlss`

is called.

terms

The corresponding `terms.bamlss`

object needed for processing.

knots

An optional list containing user specified knots, see the documentation of
function `gam`

.

formula

The corresponding `bamlss.formula`

.

family

The `bamlss.family`

object.

subdivisions

How many time points should be created for each individual.

timedependent

A character vector specifying the names of parameters in `x`

that are time-dependent. Time grid design matrices are only computed for these parameters.

timevar

A character specifying the name of the survival time variable in the data set.

idvar

Depending on the type of data set, this is the name of the variable specifying identifier of individuals.

alpha

Numeric, a starting value for the intercept of the association parameter alpha.

mu

Numeric, a starting value for the intercept of the mu parameter.

sigma

Numeric, a starting value for the intercept of the sigma parameter.

sparse

Logical, indicating if sparse matrix structures are used for updating and sampling of mu parameter model terms.

nonlinear

Logical, indicating if association is nonlinear in mu. See Details on the different model specifications.

edf_alt

Logical, indicating if an alternative computation of estimated degrees of freedom for penalized model terms should be used.

start_mu

Starting values for the computation of mu. For estimating associations which are nonlinear in mu, knot placement is based on these starting values which can improve stability.

k_mu

Number of knots for spline basis of association nonlinear in mu. Reducing this number improves stability of the estimation.

start

A named numeric vector containing possible starting values, the names are based on
function `parameters`

.

weights

Currently not supported.

offset

Currently not supported.

criterion

Information criterion to be used, e.g., for smoothing
variance selection. Options are the corrected AIC `"AICc"`

(see Details), the `"BIC"`

and
`"AIC"`

. Defaults to `"AICc"`

?

maxit

Vector containing the maximum number of iterations for the backfitting
algorithm with `maxit[1]`

defining the iterations for the full model and `maxit[2]`

the iterations within each predictor. `maxit[2]`

defaults to 1 if only one value is
specified.

nu

Vector of step lengths for parameter updates of one Newton-Raphson update for each predictor of the joint model.

update.nu

Should the updating step length be optimized in each iteration
of the backfitting algorithm? Uses `nu`

as starting value if set to `TRUE`

.

eps

The relative convergence tolerance of the backfitting algorithm.

alpha.eps

The relative convergence tolerance of the backfitting algorithm for predictor alpha.

ic.eps

The relative convergence tolerance of the information criterion used, e.g., for smoothing variance selection.

nback

For computing `ic.eps`

, how many iterations back should be included
when computing relative convergence tolerance of the information criterion.

verbose

Print information during runtime of the algorithm.

digits

Set the digits for printing when `verbose = TRUE`

.

n.iter

the number of MCMC iterations.

burnin

the burn-in phase of the sampler, i.e., the number of starting samples that should be removed.

thin

the thinning parameter for MCMC simulation. E.g., `thin = 10`

means,
that only every 10th sampled parameter will be stored.

step

How many times should algorithm runtime information be printed, divides `n.iter`

.

object

A `"bamlss"`

object processed with the JM optimizer function
`opt_JM()`

ans/or sampler function `sam_JM()`

for which the survival plot
should be created.

newdata

Dataset for which to create predictions. Not needed for conditional survival probabilities.

type

Character string indicating which type of predictions to compute. `link`

returns estimates
for all predictors with the respective link functions applied, `"parameter"`

returns the estimates
for all pedictors, `"probabilities"`

returns the survival probabilities conditional on the
survival up to the last longitudinal measurement, and `"cumhaz"`

return the cumulative hazard
up to the survival time or for a time window after the last longitudinal measurement. If `type`

is set to `"loglik"`

, the log-likelihood of the joint model is returned.

id

Integer or character, that specifies the individual for which the plot should be created.

dt

The time window after the last observed measurement for which predictions should be computed.
The default is `0.4 * max(obstime)`

and `obstime`

are the individual's longitudinal measurement times.

steps

Integer, the number of steps for which to evaluate the conditional survival probability
up to `dt`

.

FUN

A function that should be applied on the samples of predictors or
parameters, depending on argument `type`

.

cores

Specifies the number of cores that should be used for prediction. Note that
this functionality is based on the `parallel`

package.

chunks

Should computations be split into `chunks`

? Prediction is then processed
sequentially.

points

Should longitudinal observations be added to the plot.

rug

Should longitudinal observed time points be added on the x-axis to the plot.

…

Currently not used.

We refer to the papers of Koehler et al. (2017a, b) for details on the flexible additive joint model. In short, we model the hazard of subject \(i\) an event at time \(t\) as $$h_{i}(t)= \exp [\eta_{\lambda i}(t)+ \eta_{\gamma i}+\eta_{\alpha i}(\eta_{\mu i}(t), t) ]$$ with predictor \(\eta_{\lambda}\) for all survival covariates that are time-varying or have a time-varying coefficient (including the log baseline hazard), predictor \(\eta_{\gamma}\) for baseline survival covariates, predictor \(\eta_{\alpha}\) representing the potentially time-varying or nonlinear association between the longitudinal marker \(\eta_{\mu}\) and the hazard. The longitudinal response \(y_{ij}\) at time points \(t_{ij}\) is modeled as $$y_{ij}=\eta_{\mu i}(t_{ij})+e_{ij}$$ with independent normal errors \(N(0, \exp[\eta_{\sigma i}(t_{ij})]^2)\).

Each predictor \(\eta_{ki}\) is a structured additive predictor, i.e. a sum of functions of
covariates \(\eta_{ki} = \sum_{m=1}^{M_k} f_{km}(\bm{x}_{ki})\). Each of these functions can be
modeled parametrically or using basis function evaluations from the smooth constructors in
mgcv such as `s`

, `te`

and `ti`

and can
include smooth time-varying, random or spatial effects. For the Bayesian estimation of these effects
we specify corresponding priors: For linear or parametric terms we use vague normal priors, smooth
and random effect terms are regularized by placing generic multivariate normal priors on the
coefficients and for anisotropic smooths, when multiple smoothing variance parameters are involved,
more complex prior are in place (cf. Koehler et al., 2017a). We use inverse Gamma
hyper-priors, i.e. IG(0.001, 0.001) to obtain an inverse Gamma full conditional for the variance
parameters. We estimate the posterior mode by maximizing the log-posterior of the model using a
Newton-Raphson procedure, the posterior mean is obtained via derivative-based Metropolis-Hastings
sampling. We recommend to use posterior mode estimates for a quick model assessment. In order to
draw correct inferences from the model, posterior mean estimates should be computed.
We approximate integration in the survival part of the likelihood using trapezoidal rule. For
posterior mode estimation.

A variety specifications of the association \(\eta_{\alpha i}(\eta_{\mu i}(t), t)\) are possible with an
important distinction between associations which are nonlinear in \(\eta_{\mu}\) for `nonlinear = TRUE`

(Koehler et al. 2017b) or linear where \(\eta_{\alpha i}(\eta_{\mu i}(t), t) = \eta_{\alpha i}(t)\eta_{\mu i}(t)\) for `nonlinear = FALSE`

(Koehler et al. 2017a).

Koehler M, Umlauf N, Beyerlein, A., Winkler, C. Ziegler, A.-G., Greven S (2017). Flexible
Bayesian Additive Joint Models with an Application to Type 1 Diabetes Research.
*Biometrical Journal*. 10.1002/bimj.201600224

Meike Koehler, Nikolaus Umlauf, and Sonja Greven (2018). Nonlinear association structures in
flexible Bayesian additive joint models. *Statistics in Medicine*.
10.1002/sim.7967

# NOT RUN { set.seed(123) ## Simulate survival data ## with random intercepts/slopes and a linear effect of time, ## constant association alpha and no effect of the derivative d <- simJM(nsub = 200, long_setting = "linear", alpha_setting = "constant", dalpha_setting = "zero", full = FALSE) ## Formula of the according joint model f <- list( Surv2(survtime, event, obs = y) ~ s(survtime, bs = "ps"), gamma ~ s(x1, bs = "ps"), mu ~ obstime + s(id, bs = "re") + s(id, obstime, bs = "re"), sigma ~ 1, alpha ~ 1, dalpha ~ -1 ) ## Joint model estimation ## jm_bamlss() sets the default optimizer and sampler function. ## First, posterior mode estimates are computed using function ## opt_JM(), afterwards the sampler sam_JM() is started. b <- bamlss(f, data = d, family = "jm", timevar = "obstime", idvar = "id") ## Plot estimated effects. plot(b) ## Predict event probabilities for two individuals ## at 12 time units after their last longitudinal measurement. ## The event probability is conditional on their survival ## up to their last observed measurement. p <- predict(b, type = "probabilities", id = c(1, 2), dt = 12, FUN = c95) print(p) ## Plot of survival probabilities and ## corresponding longitudinal effects ## for individual id. jm_survplot(b, id = 3) jm_survplot(b, id = 30) ## Simulate survival data ## with functional random intercepts and a nonlinear effect ## of time, time-varying association alpha and no effect ## of the derivative. ## Note: This specification is the simJM default. d <- simJM(nsub = 200, full = FALSE) ## Formula of the according joint model ## number of knots for the smooth nonlinear effect of time klong <- 8 f <- list( Surv2(survtime, event, obs = y) ~ s(survtime, bs = "ps"), gamma ~ s(x1, bs = "ps"), mu ~ ti(id, bs = "re") + ti(obstime, bs = "ps", k = klong) + ti(id, obstime, bs = c("re", "ps"), k = c(nlevels(d$id), klong)) + s(x2, bs = "ps"), sigma ~ 1, alpha ~ s(survtime, bs = "ps"), dalpha ~ -1 ) ## Estimating posterior mode only using opt_JM() b_mode <- bamlss(f, data = d, family = "jm", timevar = "obstime", idvar = "id", sampler = FALSE) ## Estimating posterior means using sam_JM() ## with starting values generated from posterior mode b_mean <- bamlss(f, data = d, family = "jm", timevar = "obstime", idvar = "id", optimizer = FALSE, start = parameters(b_mode), results = FALSE) ## Plot effects. plot(b_mean, model = "alpha") ## Simulate survival data ## with functional random intercepts and an association nonlinear in mu set.seed(234) d <- simJM(nsub = 300, long_setting = "functional", alpha_setting = "nonlinear", nonlinear = TRUE, full = FALSE, probmiss = 0.9) ## Calculate longitudinal model to obtain starting values for mu long_df <- 7 f_start <- y ~ ti(id, bs = "re") + ti(obstime, bs = "ps", k = long_df) + ti(id, obstime, bs = c("re", "ps"), k = c(nlevels(d$id), long_df)) + s(x2, bs = "ps") b_start <- bamlss(f_start, data = d, sampler = FALSE) mu <- predict(b_start)$mu ## Fit joint model with nonlinear association (nonlinear = TRUE) f <- list( Surv2(survtime, event, obs = y) ~ s(survtime, bs = "ps"), gamma ~ x1, mu ~ ti(id, bs = "re") + ti(obstime, bs = "ps", k = long_df) + ti(id, obstime, bs = c("re", "ps"), k = c(nlevels(d$id), long_df)) + s(x2, bs = "ps"), sigma ~ 1, alpha ~ 1, dalpha ~ -1 ) b <- bamlss(f, data = d, family = "jm", timevar = "obstime", idvar = "id", nonlinear = TRUE, start_mu = mu, n.iter = 6000, burnin = 2000, thin = 2) plot(b) samplestats(b$samples) # }