`simJM.Rd`

Simulates longitudinal data with normal error and (Cox-type) survival times
using the inversion method. The function `simJM()` is a wrapper specifying
all predictors and the resulting data sets. The wrapper calls `rJM()` to sample
the survival times, a modified version of `rSurvtime()` from the R package
CoxFlexBoost.

simJM(nsub = 300, times = seq(0, 120, 1), probmiss = 0.75,
  long_setting = "functional",
  alpha_setting = if(nonlinear) "linear" else "nonlinear",
  dalpha_setting = "zero", sigma = 0.3, long_df = 6, tmax = NULL,
  seed = NULL, full = FALSE, file = NULL, nonlinear = FALSE, fac = FALSE)

rJM(hazard, censoring, x, r, subdivisions = 1000, tmin = 0, tmax,
  file = NULL, ...)

Argument | Description |
---|---|
nsub | number of individuals for which longitudinal data and survival times should be simulated. |
times | vector of time points at which longitudinal measurements are "sampled". |
probmiss | proportion of longitudinal measurements to be set to missing. Used to induce sparsity in the longitudinal measurements. |
long_setting | Specification of the longitudinal trajectories of the sampled subjects. Preset specifications are |
alpha_setting | specification of the association between survival and longitudinal. Preset specifications are |
dalpha_setting | specification of the association between survival and the derivative of the longitudinal. Work in progress. |
sigma | standard deviation of the normal error around the true longitudinal measurements. |
long_df | number of basis functions from which functional random intercepts are sampled. |
tmax | For function |
seed | numeric scalar setting the random seed. |
full | logical indicating if only the longitudinal data set should be returned ( |
file | name of the data file the generated data set should be stored into (e.g., "simdata.RData") or NULL if the dataset should directly be returned in R. |
nonlinear | If set to |
fac | If set to |
hazard | complete hazard function to specify the joint model. Time must be the first argument. |
censoring | function to compute (random) censoring. |
x | matrix of sampled covariate values. |
r | matrix of sampled random coefficients. |
subdivisions | the maximum number of subintervals for the integration. |
tmin | earliest time point to sample a survival time. |
… | further arguments to be passed to |
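
As an illustration of how `probmiss` induces sparsity, the following is a minimal sketch that drops each longitudinal measurement independently with probability `probmiss`. The helper `thin_measurements()` and the column names `obstime` and `id` are hypothetical and chosen only for this example; the actual mechanism inside `simJM()` may differ.

```r
# Hypothetical helper (not part of the package): keep each row of a
# long-format data set with probability 1 - probmiss.
thin_measurements <- function(d, probmiss = 0.75, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  keep <- runif(nrow(d)) > probmiss
  d[keep, , drop = FALSE]
}

# Toy long-format data: 5 subjects measured at times 0, 1, ..., 120.
d <- expand.grid(obstime = seq(0, 120, 1), id = 1:5)

# Thin the measurements; about a quarter of the rows should remain.
d_sparse <- thin_measurements(d, probmiss = 0.75, seed = 1)
nrow(d_sparse) / nrow(d)
```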

The function simulates longitudinal data based on the given specification at the given `times`.
The full hazard is built from all joint model predictors \(\eta_{\mu}\), \(\eta_{\sigma}\),
\(\eta_{\lambda}\), \(\eta_{\gamma}\), \(\eta_{\alpha}\) as presented in
Koehler, Umlauf, and Greven (2016); see also `jm_bamlss`. Survival times are sampled using the inversion
method (cf. Bender, Augustin, & Blettner, 2005). Additional censoring and missingness are
introduced. The longitudinal information is censored according to the survival information.
Users can also specify their own predictors and use only `rJM` to simulate survival times
accordingly.
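
The inversion method can be sketched in a few lines of base R. For a hazard \(h(t)\) with cumulative hazard \(H(t) = \int_{t_{min}}^{t} h(s)\,ds\), a survival time \(T\) solves \(H(T) = -\log(U)\) with \(U\) uniform on (0, 1). The helper `r_surv_inversion()` below is hypothetical and is not the implementation of `rJM()`; it assumes a vectorized hazard with time as its first argument and returns `tmax` when no event occurs before `tmax`.

```r
# Hypothetical helper: draw one survival time by inverting the cumulative hazard.
r_surv_inversion <- function(hazard, tmin = 0, tmax = 120,
                             subdivisions = 1000, ...) {
  target <- -log(runif(1))  # S(T) = U  <=>  H(T) = -log(U)

  # Cumulative hazard via numerical integration of the hazard function.
  cumhaz <- function(t) {
    integrate(hazard, lower = tmin, upper = t,
              subdivisions = subdivisions, ...)$value
  }

  # No event before tmax: report tmax (administrative censoring is assumed).
  if (cumhaz(tmax) < target) return(tmax)

  # Otherwise solve H(t) = target for t on [tmin, tmax].
  uniroot(function(t) cumhaz(t) - target, lower = tmin, upper = tmax)$root
}

# Example with an arbitrary increasing hazard, chosen only for illustration.
set.seed(1)
hazard <- function(t) 0.02 + 0.001 * t
surv_times <- replicate(200, r_surv_inversion(hazard))
summary(surv_times)
```

Numerical integration followed by root finding is slower than a closed-form inverse, but it works for any hazard that can be evaluated pointwise, which is what a time-varying joint model hazard requires.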

Pre-specified functions for \(\eta_{\mu}\) in `long_setting` are, for `linear`,
$$\eta_{\mu i}(t) = 1.25 + r_{1i} + 0.6 \sin(x_{2i}) + (-0.01) t + 0.02 r_{2i} t,$$
for `nonlinear`,
$$\eta_{\mu i}(t) = 0.5 + r_{1i} + 0.6 \sin(x_{2i}) + 0.1 (t+1) \exp(-0.075 t),$$
and for `functional`,
$$\eta_{\mu i}(t) = 0.5 + r_{1i} + 0.6 \sin(x_{2i}) + 0.1 (t+1) \exp(-0.075 t) + \sum_k \beta_{ki} B_k(t),$$
where \(B_k(.)\) denotes the \(k\)-th B-spline basis function and \(\beta_{ki}\) are the sampled penalized
coefficients from `gen_b` per person.
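
For readers who want to evaluate these predictors directly, the `linear` and `nonlinear` presets can be written as plain R functions. This is a sketch transcribed from the formulas above; the function and argument names are hypothetical, and the `functional` preset is omitted because it depends on coefficients sampled by `gen_b`.

```r
# Hypothetical transcriptions of the preset longitudinal predictors.
# t: time, x2: covariate x_2i, r1/r2: random intercept and slope of subject i.
eta_mu_linear <- function(t, x2, r1, r2) {
  1.25 + r1 + 0.6 * sin(x2) + (-0.01) * t + 0.02 * r2 * t
}

eta_mu_nonlinear <- function(t, x2, r1) {
  0.5 + r1 + 0.6 * sin(x2) + 0.1 * (t + 1) * exp(-0.075 * t)
}

# True trajectory of one subject on the default time grid, plus normal
# measurement error with the default standard deviation sigma = 0.3.
set.seed(1)
t_grid <- seq(0, 120, by = 1)
mu <- eta_mu_nonlinear(t_grid, x2 = 0, r1 = 0)
y <- mu + rnorm(length(t_grid), sd = 0.3)
head(cbind(t_grid, mu, y))
```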

Pre-specified functions for \(\eta_{\alpha}\) in `alpha_setting` are, for `constant`,
$$\eta_{\alpha}(t) = 1,$$
for `linear`,
$$\eta_{\alpha}(t) = 1 - 0.015 t,$$
for `nonlinear`,
$$\eta_{\alpha}(t) = \cos((t-20)/20),$$
and for `nonlinear2`,
$$\eta_{\alpha}(t) = \cos((t-33)/33).$$
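
The association presets can be evaluated in the same way. The sketch below covers the `constant`, `linear`, and first `nonlinear` formulas given above; the function name `eta_alpha()` is hypothetical and only mirrors the formulas, not the package internals.

```r
# Hypothetical transcription of the association predictor presets.
eta_alpha <- function(t, alpha_setting = c("constant", "linear", "nonlinear")) {
  alpha_setting <- match.arg(alpha_setting)
  switch(alpha_setting,
    constant  = rep(1, length(t)),
    linear    = 1 - 0.015 * t,
    nonlinear = cos((t - 20) / 20)
  )
}

# Compare the three association curves on a coarse time grid.
t_grid <- seq(0, 120, by = 20)
sapply(c("constant", "linear", "nonlinear"),
       function(s) eta_alpha(t_grid, s))
```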

Additionally, the fixed function \(\eta_{\lambda} = 0.1(t+2)\exp(-0.075t)\) is employed.

For `full = TRUE`, a list of three `data.frame`s is returned:

Simulated dataset in long format including all longitudinal and survival covariates.

Dataset of the time-varying survival predictors which are not subject specific, evaluated at a grid of fixed time points.

Simulated data set prior to generating longitudinal missings. Useful to assess the longitudinal fit.

Hofner, B. (2016). CoxFlexBoost: Boosting Flexible Cox Models (with Time-Varying Effects). R package version 0.7-0.

Bender, R., Augustin, T., and Blettner, M. (2005).
Generating Survival Times to Simulate Cox Proportional Hazards Models.
*Statistics in Medicine*, **24**, 1713-1723.

Koehler, N., Umlauf, N., Beyerlein, A., Winkler, C., Ziegler, A., and Greven, S. (2016). Flexible Bayesian Additive Joint Models with an
Application to Type 1 Diabetes Research. *(submitted)*

# NOT RUN {
## Simulate survival data
## with functional random intercepts and a nonlinear effect
## of time, time-varying association alpha.
d <- simJM(nsub = 300)
head(d)

## Simulate survival data
## with random intercepts/slopes and a linear effect of time,
## constant association alpha.
d <- simJM(nsub = 200, long_setting = "linear", alpha_setting = "constant")
head(d)
# }