
This function fits a penalized Cox model in which only baseline covariates are included as predictors. Optionally (when n.boots > 0), it also runs a bootstrap optimism correction procedure to validate the predictive performance of the model.
pencox_baseline(data, formula, penalty = "ridge", standardize = TRUE,
  penalty.factor = 1, n.alpha.elnet = 11, n.folds.elnet = 5,
  n.boots = 0, n.cores = 1, verbose = TRUE)
A list containing the following objects:
call
: the function call;
pcox.orig
: the penalized Cox model fitted on the original dataset;
surv.data
: a data frame with the survival data;
X.orig
: a data frame with the design matrix used to estimate the Cox model;
n.boots
: number of bootstrap samples;
boot.ids
: a list with the ids of bootstrapped subjects (when n.boots > 0);
pcox.boot
: a list where each element is a fitted penalized Cox model for a given bootstrap sample (when n.boots > 0).
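To give a feel for what a list such as boot.ids might hold, the sketch below draws bootstrap samples of subject ids with replacement in base R. The ids vector, the number of samples and the sampling scheme are assumptions made for illustration; they are not taken from the package's internals.

```r
# Illustrative sketch only: NOT the package's internal code.
set.seed(1)
ids = 1:10        # hypothetical subject ids
n.boots = 3       # hypothetical number of bootstrap samples

# one vector of resampled ids per bootstrap sample
boot.ids = lapply(seq_len(n.boots), function(b) {
  sample(ids, size = length(ids), replace = TRUE)
})

# each element has as many ids as the original data, possibly with repeats
str(boot.ids)
```

Each bootstrap sample would then be used to refit the model, which is why pcox.boot is described as a list with one fitted model per sample.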
data
: a data frame with one row for each subject. It should contain at least a subject id (called id), the time-to-event outcome (time), the binary censoring indicator (event), and at least one covariate to be included in the linear predictor
formula
: a formula specifying the variables in data to include as predictors in the penalized Cox model
penalty
: the type of penalty function used for regularization. Default is 'ridge'; other possible values are 'elasticnet' and 'lasso'
standardize
: logical argument: should the covariates be standardized when included in the penalized Cox model? Default is TRUE
penalty.factor
: a single value, or a vector of values, indicating whether the covariates (if any) should be penalized (1) or not (0). Default is penalty.factor = 1
n.alpha.elnet
: number of alpha values for the two-dimensional grid of tuning parameters in elasticnet. Only relevant if penalty = 'elasticnet'. Default is 11, so that the resulting alpha grid is c(1, 0.9, 0.8, ..., 0.1, 0)
n.folds.elnet
: number of folds to be used for the selection of the tuning parameter in elasticnet. Only relevant if penalty = 'elasticnet'. Default is 5
n.boots
: number of bootstrap samples to be used in the bootstrap optimism correction procedure. If 0, no bootstrapping is performed
n.cores
: number of cores to use to parallelize the computation of the bootstrap optimism correction procedure. If n.cores = 1 (default), no parallelization is done. Tip: you can use parallel::detectCores() to check how many cores are available on your computer
verbose
: if TRUE (default and recommended value), information on the ongoing computations is printed in the console
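The alpha grid mentioned for n.alpha.elnet can be reproduced with base R. The snippet below is only a sketch of how an equally spaced grid of that form can be built; the variable names are illustrative and the function's internal construction may differ.

```r
# Illustrative sketch: an equally spaced alpha grid from 1 down to 0.
n.alpha.elnet = 11
alpha.grid = seq(1, 0, length.out = n.alpha.elnet)
print(alpha.grid)
```

With n.alpha.elnet = 11 the grid has a step of 0.1 between consecutive values; a larger value gives a finer grid at a higher computational cost.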
Mirko Signorelli
Signorelli, M., Spitali, P., Al-Khalili Szigyarto, C., The MARK-MD Consortium, Tsonaka, R. (2021). Penalized regression calibration: a method for the prediction of survival outcomes using complex longitudinal and high-dimensional data. Statistics in Medicine, 40(27), 6178-6196. DOI: 10.1002/sim.9178
fit_prclmm, fit_prcmlpmm
# generate example data
set.seed(1234)
p = 4 # number of longitudinal predictors
simdata = simulate_prclmm_data(n = 100, p = p, p.relev = 2,
             seed = 123, t.values = c(0, 0.2, 0.5, 1, 1.5, 2))

# create a data frame with baseline measurements only
# (the first row of each subject in the longitudinal data)
baseline.visits = simdata$long.data[which(!duplicated(simdata$long.data$id)), ]
df = cbind(simdata$surv.data, baseline.visits)
df = df[ , -c(5:7)]

do.bootstrap = FALSE
# IMPORTANT: set do.bootstrap = TRUE to compute the optimism correction!
n.boots = ifelse(do.bootstrap, 100, 0)

more.cores = FALSE
# IMPORTANT: set more.cores = TRUE to speed computations up!
if (!more.cores) n.cores = 2
if (more.cores) {
  # identify number of available cores on your machine
  n.cores = parallel::detectCores()
  if (is.na(n.cores)) n.cores = 2
}

form = as.formula(~ baseline.age + marker1 + marker2
                  + marker3 + marker4)
base.pcox = pencox_baseline(data = df,
              formula = form,
              n.boots = n.boots, n.cores = n.cores)
# list the objects returned by the function
ls(base.pcox)