Usage
trainControl(method = "boot",
             number = ifelse(grepl("cv", method), 10, 25),
             repeats = ifelse(grepl("cv", method), 1, number),
             p = 0.75,
             initialWindow = NULL,
             horizon = 1,
             fixedWindow = TRUE,
             verboseIter = FALSE,
             returnData = TRUE,
             returnResamp = "final",
             savePredictions = FALSE,
             classProbs = FALSE,
             summaryFunction = defaultSummary,
             selectionFunction = "best",
             preProcOptions = list(thresh = 0.95, ICAcomp = 3, k = 5),
             sampling = NULL,
             index = NULL,
             indexOut = NULL,
             timingSamps = 0,
             predictionBounds = rep(FALSE, 2),
             seeds = NA,
             adaptive = list(min = 5, alpha = 0.05,
                             method = "gls", complete = TRUE),
             trim = FALSE,
             allowParallel = TRUE)
Arguments
method
The resampling method: boot, boot632, cv, repeatedcv, LOOCV, LGOCV (for repeated training/test splits), none (only fits one model to the entire training set), or one of the adaptive methods listed under the adaptive argument.
number
Either the number of folds or number of resampling iterations
repeats
For repeated k-fold cross-validation only: the number of complete sets of folds to compute
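As a minimal sketch of how method, number and repeats combine, the call below requests 10-fold cross-validation repeated 5 times. The specific values are illustrative only, and the block is skipped when the caret package is not installed.

```r
# Sketch only: needs the caret package; the guard skips it otherwise.
if (requireNamespace("caret", quietly = TRUE)) {
  # 10 folds, and 5 complete sets of those folds (illustrative values)
  ctrl <- caret::trainControl(method = "repeatedcv",
                              number = 10,
                              repeats = 5)
  # Inspect the settings stored in the returned control list
  str(ctrl[c("method", "number", "repeats")])
}
```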
verboseIter
A logical for printing a training log.
returnData
A logical for saving the data
returnResamp
A character string indicating how much of the resampled summary metrics should be saved. Values can be "final", "all" or "none".
savePredictions
a logical to save the hold-out predictions for each resample
p
For leave-group-out cross-validation: the proportion of the data used for training (the default of 0.75 keeps 75% of the rows for training).
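To make the meaning of p concrete, here is a small base R sketch of a single training/test split with p = 0.75. It uses a plain random split on a hypothetical data set of 100 rows; the package's own splitting may differ (for example, by stratifying on the outcome), so treat this as an illustration of the proportion only.

```r
set.seed(1)                 # arbitrary seed so the sketch is reproducible
n <- 100                    # hypothetical number of rows in the training set
p <- 0.75                   # same meaning as the p argument

# Rows used for fitting in this split
train_rows <- sort(sample(seq_len(n), size = floor(p * n)))
# Remaining rows are held out for testing
test_rows <- setdiff(seq_len(n), train_rows)

length(train_rows)
length(test_rows)
```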
initialWindow, horizon, fixedWindow
possible arguments to createTimeSlices
classProbs
a logical; should class probabilities be computed for classification models (along with predicted values) in each resample?
summaryFunction
a function to compute performance metrics across resamples. The arguments to the function should be the same as those in defaultSummary.
selectionFunction
the function used to select the optimal tuning parameter. This can be a name of the function or the function itself. See best for details and other options.
preProcOptions
A list of options to pass to preProcess. The type of pre-processing (e.g. center, scaling etc) is passed in via the preProc option in train.
sampling
PLACEHOLDER FOR SAMPLING DOCUMENTATION
index
a list with elements for each resampling iteration. Each list element is a vector of integers corresponding to the rows used for training at that iteration.
indexOut
a list (the same length as index) that dictates which data are held-out for each resample (as integers). If NULL, then the unique set of samples not contained in index is used.
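The relationship between index and the default hold-out set can be sketched in base R. The example below builds three hypothetical bootstrap resamples over 20 rows and derives, for each one, the rows that would be held out when indexOut is NULL. The resample names and sizes are assumptions made for illustration.

```r
set.seed(2)                 # arbitrary seed so the sketch is reproducible
n <- 20                     # hypothetical number of rows

# index: one integer vector of training rows per resampling iteration
# (bootstrap samples, so a row may appear more than once)
index <- lapply(1:3, function(i) sample(seq_len(n), n, replace = TRUE))
names(index) <- paste0("Resample", 1:3)

# Default hold-out described above: the unique rows not present in index
index_out <- lapply(index, function(train) setdiff(seq_len(n), train))

index_out
```

A list shaped like index could then be supplied through the index argument, with index_out passed as indexOut only when the default hold-out is not what is wanted.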
timingSamps
the number of training set samples that will be used to measure the time for predicting samples (zero indicates that the prediction time should not be estimated).
predictionBounds
a logical or numeric vector of length 2 (regression only). If logical, the predictions can be constrained to be within the limit of the training set outcomes. For example, a value of c(TRUE, FALSE) would only constrain the lower end of predictions.
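The effect of a logical predictionBounds value can be illustrated with a small base R sketch. The helper clamp_predictions below is hypothetical and is not part of the package; it only mimics the behaviour described above, using made-up outcomes and predictions.

```r
# Hypothetical helper (not a caret function): constrain predictions to the
# range of the training outcomes according to a logical vector of length 2.
clamp_predictions <- function(pred, y_train, bounds = c(FALSE, FALSE)) {
  if (bounds[1]) pred <- pmax(pred, min(y_train))  # lower limit
  if (bounds[2]) pred <- pmin(pred, max(y_train))  # upper limit
  pred
}

y_train <- c(2, 5, 9)       # made-up training outcomes
pred    <- c(-1, 4, 12)     # made-up raw predictions

# Only the lower end is constrained, so -1 becomes 2 and 12 is left alone
clamp_predictions(pred, y_train, bounds = c(TRUE, FALSE))
```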
seeds
an optional set of integers that will be used to set the seed at each resampling iteration. This is useful when the models are run in parallel. A value of NA will stop the seed from being set within the worker processes while a value of
adaptive
a list used when method is "adaptive_cv", "adaptive_boot" or "adaptive_LGOCV". See Details below.
trim
a logical. If TRUE, the final model in object$finalModel may have some components of the object removed to reduce the size of the saved object. The predict method will still work, but some other features of the model may not work.
allowParallel
if a parallel backend is loaded and available, should the function use it?
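Returning to the summaryFunction argument, a custom metric function can be sketched as follows. The function name mae_summary is hypothetical. The sketch assumes the function receives a data frame with columns obs and pred and takes the arguments data, lev and model, mirroring defaultSummary; check this against the installed version before relying on it.

```r
# Hypothetical custom summary function with the same arguments as
# defaultSummary. Assumes `data` has numeric columns `obs` and `pred`.
mae_summary <- function(data, lev = NULL, model = NULL) {
  # Return a named numeric vector; the name identifies the metric
  c(MAE = mean(abs(data$obs - data$pred)))
}

# Made-up hold-out results for one resample
resample <- data.frame(obs = c(1, 2, 3), pred = c(1.5, 2, 2))
mae_summary(resample)
```

Such a function would be supplied as summaryFunction = mae_summary in the trainControl call.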