# simJM

##### Simulate longitudinal and survival data for joint models

Simulates longitudinal data with normal error and (Cox-type) survival times
using the inversion method. The function `simJM()`

is a wrapper specifying
all predictors and the resulting data sets. The wrapper calls `rJM()`

to sample
the survival times, a modified version of `rSurvtime()`

from the R package
CoxFlexBoost.

- Keywords
- distribution, survival

##### Usage

```
simJM(nsub = 300, times = seq(0, 120, 1), probmiss = 0.75,
long_setting = "functional",
alpha_setting = if(nonlinear) "linear" else "nonlinear",
dalpha_setting = "zero", sigma = 0.3, long_df = 6, tmax = NULL,
seed = NULL, full = FALSE, file = NULL, nonlinear = FALSE,
fac = FALSE)
rJM(hazard, censoring, x, r,
subdivisions = 1000, tmin = 0, tmax,
file = NULL, ...)
```

##### Arguments

- nsub
number of individuals for which longitudinal data and survival times should be simulated.

- times
vector of time points at which longitudinal measurements are "sampled".

- probmiss
proportion of longitudinal measurements to be set to missing. Used to induce sparsity in the longitudinal measurements.

- long_setting
Specification of the longitudinal trajectories of the sampled subjects. Preset specifications are

`"linear"`

,`"nonlinear"`

and`"functional"`

. See Details.- alpha_setting
specification of the association between survival and longitudinal. Preset specifications are

`"simple"`

,`"linear"`

,`"nonlinear"`

and`"nonlinear2"`

. See Details.- dalpha_setting
specification of the association between survival and the derivative of the longitudinal. Work in progress.

- sigma
standard deviation of the normal error around the true longitudinal measurements.

- long_df
number of basis functions from which functional random intercepts are sampled.

- tmax
For function

`simJM()`

, longest possible survival time, observations are censored after that timepoint. Defaults to`max(times)`

and should not be specified longer than`max(times)`

for longitudinal setting "functional". For function`rJM()`

, latest time point to sample a survival time.- seed
numeric scalar setting the random seed.

- full
logical indicating if only the longitudinal data set should be returned (

`FALSE`

) or additionally also the data for the survival part evaluated on a regular time grid and the longitudinal data set without longitudinal missings (`TRUE`

).- file
name of the data file the generated data set should be stored into (e.g., "simdata.RData") or NULL if the dataset should directly be returned in R.

- nonlinear
If set to

`TRUE`

, a nonlinear interaction between`alpha`

and`mu`

is simulated.- fac
If set to

`TRUE`

, a smooth interaction that varies by a factor is simulated.- hazard
complete hazard function to specify the joint model. Time must be the first argument.

- censoring
function to compute (random) censoring.

- x
matrix of sampled covariate values.

- r
matrix of sampled random coefficients.

- subdivisions
the maximum number of subintervals for the integration.

- tmin
earliest time point to sample a survival time.

- …
further arguments to be passed to

`hazard`

or`censoring`

.

##### Details

The function simulates longitudinal data basing on the given specification at given `times`

.
The full hazard is built from all joint model predictors \(\eta_{\mu}\), \(\eta_{\sigma}\),
\(\eta_{\lambda}\), \(\eta_{\gamma}\), \(\eta_{\alpha}\) as presented in
Koehler, Umlauf, and Greven (2016), see also `jm_bamlss`

. Survival times are sampled using the inversion
method (cf. Bender, Augustin, & Blettner, 2005). Additional censoring and missingness is
introduced. The longitudinal information is censored according to the survival information. The
user can also specify own predictors and use only `rJM`

to simulate survival times
accordingly.

Pre-specified functions for \(\eta_{\mu}\) in `long_setting`

are for `linear`

$$\eta_{\mu i}(t) = 1.25 + r_{1i} + 0.6 \sin(x_{2i}) + (-0.01) t + 0.02 r_{2i} t$$,
for `nonlinear`

$$\eta_{\mu i}(t) = 0.5 + r_{1i} + 0.6 \sin(x_{2i}) + 0.1 (t+1) \exp(-0.075 t)$$
and for `functional`

$$\eta_{\mu i}(t) = 0.5 + r_{1i} + 0.6 \sin(x_{2i}) + 0.1 (t+1) \exp(-0.075 t) + \sum_k \beta_{ki} B(t)$$,
where \(B(.)\) denotes a B-spline basis function and \(\beta_{ki}\) are the sampled penalized
coefficients from `gen_b`

per person.

Prespecified functions for \(\eta_{\alpha}\) in `alpha_setting`

are for `constant`

$$\eta_{\alpha}(t) = 1$$, for `linear`

$$\eta_{\alpha}(t) = 1 - 0.015 t$$, for
`nonlinear`

$$\eta_{\alpha}(t) = \cos((time-20)/20)$$, and for `nonlinear`

$$\eta_{\alpha}(t) = \cos((time-33)/33)$$.

Additionally the fixed functions for \(\eta_{\lambda} = 0.1(t+2)\exp(-0.075t)\) and \(\eta_{\lambda} = 0.1(t+2)\exp(-0.075t)\) are employed.

##### Value

For `full = TRUE`

a list of the three `data.frame`

s is returned:

Simulated dataset in long format including all longitudinal and survival covariates.

Dataset of the time-varying survival predictors which are not subject specific, evaluated at a grid of fixed time points.

Simulated data set prior to generating longitudinal missings. Useful to assess the longitudinal fit.

Covariates within these datasets include a subject identifier id, the sampled survival times survtime, the event indicator event, the time points of longitudinally "observed" measurements obstime, the longitudinal response y, the cumulative hazard at the survival time cumhaz, as well as covariates x1, x2, random effects r1, r2, b1, ..., and the true predictors alpha, lambda, gamma, mu, sigma.

##### References

Hofner, B (2016). CoxFlexBoost: Boosting Flexible Cox Models (with Time-Varying Effects). R package version 0.7-0.

Bender, R., Augustin, T., and Blettner, M. (2005).
Generating Survival Times to Simulate Cox Proportional Hazards Models.
*Statistics in Medicine*, **24**, 1713-1723.

Koehler N, Umlauf N, Beyerlein, A., Winkler, C., Ziegler, A., and Greven S (2016). Flexible Bayesian Additive Joint Models with an
Application to Type 1 Diabetes Research. *(submitted)*

##### See Also

##### Examples

```
# NOT RUN {
## Simulate survival data
## with functional random intercepts and a nonlinear effect
## of time, time-varying association alpha.
d <- simJM(nsub = 300)
head(d)
## Simulate survival data
## with random intercepts/slopes and a linear effect of time,
## constant association alpha.
d <- simJM(nsub = 200, long_setting = "linear",
alpha_setting = "constant")
head(d)
# }
```

*Documentation reproduced from package bamlss, version 1.1-2, License: GPL-2 | GPL-3*