Generates a trained caret model using the given primary binary classification algorithm. Optionally generates a stacked ensemble model if a list of base learners is supplied.
spect_train(
test_prop = 0.2,
censor_type = "half",
bin_slices = 10,
method = "repeatedcv",
resampling_number = 10,
kfold_repeats = 3,
model_algorithm,
base_learner_list = list(),
metric = "Kappa",
rng_seed = 42,
use_parallel = TRUE,
cores = 0,
modeling_data,
event_indicator_var,
survival_time_var,
obs_window
)
A list containing all intermediate data sets created by `spect_train`, the trained caret model object, the following parameters passed to `spect_train` (`obs_window`, `survival_time_var`, `event_indicator_var`, `base_learner_list`, `bin_slices`), and the bounds of each interval generated from the training data set.
optional proportion of the data set to reserve for testing
optional method used to determine censoring in a given bin; may be "half", "prev" or "same". See `createDiscreteDat` for usage.
optional number of intervals to use for predictions.
optional resampling method passed to caret (default "repeatedcv")
optional for repeated cv
optional number of folds
primary classification algorithm. Trains a stack-ensemble model if `base_learner_list` is supplied, otherwise trains a simple classifier model.
optional list of base learner algorithms
optional metric for model calibration
optional random number generation seed for reproducibility
optionally make use of the caret multicore training cluster
optional number of cores for multicore training. If zero, spect will attempt to make a good choice. Note: only relevant if `use_parallel` is set to TRUE; otherwise this parameter is ignored.
This data set must have one column for time and one column for the event indicator. The remaining columns are treated as covariates for modeling.
The name of the column containing the event indicator (values in this column must be zero or one).
The name of the column containing the time variable.
The last time to use for generating person-period data. Any event occurring after this time will be administratively censored. In general, choosing a time at or near the maximum observed time will include most events.
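To make the roles of `obs_window` and `bin_slices` concrete, the following base R sketch applies administrative censoring at the observation window and then assigns each subject to an interval. It is illustrative only: the toy data frame, the column names, and the equal-width bounds are assumptions made for this sketch, and `spect_train` may derive its interval bounds differently.

```r
# Toy survival data (hypothetical): time to event and a 0/1 event indicator.
toy <- data.frame(
  time  = c(5, 12, 30, 47, 60, 75),
  event = c(1, 0, 1, 1, 1, 0)
)

obs_window <- 48  # last time used for generating person-period data
bin_slices <- 4   # number of intervals (the documented default is 10)

# Administrative censoring: follow-up beyond obs_window is truncated at
# obs_window and the event indicator is set to zero.
censored <- toy
late <- censored$time > obs_window
censored$event[late] <- 0
censored$time[late]  <- obs_window

# Equal-width interval bounds over [0, obs_window] (an assumption of this sketch).
bounds <- seq(0, obs_window, length.out = bin_slices + 1)

# Index (1..bin_slices) of the interval in which each subject's follow-up ends.
censored$last_bin <- cut(censored$time, breaks = bounds,
                         labels = FALSE, include.lowest = TRUE)

print(bounds)
print(censored)
```

In this sketch the two subjects with times 60 and 75 are cut off at 48 and treated as censored, which is why a window near the maximum observed time retains most events.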
Stephen Abrams, stephen.abrams@louisville.edu
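A minimal call might look like the sketch below. The simulated data frame and its column names (`time`, `event`, `age`, `score`) are hypothetical, and the sketch assumes the function is exported from a package named spect; "glm" is used only as an example of a caret binary classifier. All arguments not shown keep the defaults listed in the usage above.

```r
# Hypothetical toy data: one time column, one 0/1 event column, two covariates.
set.seed(42)
n <- 200
toy_data <- data.frame(
  time  = round(rexp(n, rate = 1 / 30)) + 1,
  event = rbinom(n, size = 1, prob = 0.7),
  age   = rnorm(n, mean = 60, sd = 10),
  score = runif(n)
)

# Only attempt training when the spect package is installed (assumed name).
if (requireNamespace("spect", quietly = TRUE)) {
  result <- spect::spect_train(
    model_algorithm     = "glm",    # example caret classifier
    modeling_data       = toy_data,
    event_indicator_var = "event",
    survival_time_var   = "time",
    obs_window          = 48,
    use_parallel        = FALSE     # keep the sketch single-threaded
  )
  # Inspect the top-level elements of the returned list.
  print(names(result))
}
```

Because `base_learner_list` is left at its empty default, this trains a simple classifier rather than a stacked ensemble.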