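makeResampleDesc: Create a description object for a resampling strategy.

A description of a resampling algorithm contains all necessary information to create a ResampleInstance, when given the size of the data set.

Usage: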
makeResampleDesc(method, predict = "test", ..., stratify = FALSE, stratify.cols = NULL)

Arguments:

method [character(1)]
"CV" for cross-validation, "LOO" for leave-one-out, "RepCV" for repeated cross-validation, "Bootstrap" for out-of-bag bootstrap, "Subsample" for subsampling, "Holdout" for holdout.

predict [character(1)]
What to predict during resampling: train, test or both sets.
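Default is test.

... [any]
Further parameters for the strategies:
iters [integer(1)]: Number of iterations, for "CV", "Subsample" and "Bootstrap".
split [numeric(1)]: Proportion of training cases for "Holdout" and "Subsample", between 0 and 1. Default is 2/3.
reps [integer(1)]: Repeats for "RepCV". Here iters = folds * reps. Default is 10.
folds [integer(1)]: Folds in the repeated CV for "RepCV". Here iters = folds * reps. Default is 10.

stratify [logical(1)]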
Should stratification be done for the target variable?
For classification tasks, this means that the resampling strategy is applied to all classes
individually and the resulting index sets are joined to make sure that the proportion of
observations in each training set is as in the original data set. Useful for imbalanced class sizes.
For survival tasks stratification is done on the events, resulting in training sets with comparable
censoring rates.

stratify.cols [character]
Stratify on specific columns referenced by name. All columns have to be factors.
Note that you must ensure that stratification is possible, i.e.
that each stratum contains enough observations.
This argument and stratify are mutually exclusive.

Value:
[ResampleDesc].
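
The stratification described for the stratify argument (apply the strategy to each class individually, then join the index sets) can be illustrated with a minimal base R sketch. The helper stratified_holdout and its prop parameter are hypothetical names used only for illustration; this is not how mlr implements it internally.

```r
# Hypothetical helper, not part of mlr: a stratified holdout split in base R.
stratified_holdout <- function(y, prop = 2/3) {
  y <- as.factor(y)
  # Group the observation indices by class.
  idx_by_class <- split(seq_along(y), y)
  # Sample the training indices within each class, then join the index sets.
  train <- unlist(lapply(idx_by_class, function(idx) {
    n_train <- round(length(idx) * prop)
    idx[sample.int(length(idx), n_train)]
  }), use.names = FALSE)
  list(train = sort(train), test = setdiff(seq_along(y), train))
}

set.seed(1)
y <- factor(rep(c("a", "b"), times = c(90, 10)))  # imbalanced classes
s <- stratified_holdout(y, prop = 2/3)
table(y[s$train])  # class counts in the training set
```

Because sampling happens per class, the minority class keeps roughly the same share in the training set as in the full data, which is the property the argument description refers to.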
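
Details:
Some notes on special strategies:
Repeated cross-validation: Use "RepCV". Then set the aggregation function for your preferred performance measure to "testgroup.mean" via setAggregation.
B632 bootstrap: Use "Bootstrap" and set predict to "both". Then set the aggregation function for your preferred performance measure to "b632" via setAggregation.
B632+ bootstrap: Use "Bootstrap" and set predict to "both". Then set the aggregation function for your preferred performance measure to "b632plus" via setAggregation.
Fixed train test split: Use makeFixedHoldoutInstance.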
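
Pairing a resampling description with an aggregation via setAggregation might look like the sketch below. It assumes the mlr package (and rpart, used by the learner) is installed; the measure mmce, the aggregations testgroup.mean and b632, the learner "classif.rpart" and the task iris.task are taken from mlr and chosen here only for illustration.

```r
library(mlr)

# Repeated CV: 5 folds x 2 repetitions, so 10 resampling iterations.
rdesc <- makeResampleDesc("RepCV", folds = 5, reps = 2)

# Test performances are averaged within each repetition, then across repetitions.
m <- setAggregation(mmce, testgroup.mean)

# B632 bootstrap needs predictions on both the training and the test sets.
bdesc <- makeResampleDesc("Bootstrap", iters = 10, predict = "both")
m632 <- setAggregation(mmce, b632)

set.seed(1)
r <- resample("classif.rpart", iris.task, rdesc, measures = m, show.info = FALSE)
r$aggr  # aggregated misclassification error
```

The bootstrap objects bdesc and m632 would be passed to resample in the same way as rdesc and m.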
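
Object slots:
id [character(1)]: Name of the resampling strategy.
iters [integer(1)]: Number of iterations. This is always the complete number of generated train/test sets.
predict [character(1)]: See argument.
stratify [logical(1)]: See argument.

See also:
ResamplePrediction, ResampleResult, getRRPredictions, makeResampleInstance, resample
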
Examples:
library(mlr)

# Bootstrapping
makeResampleDesc("Bootstrap", iters = 10)
makeResampleDesc("Bootstrap", iters = 10, predict = "both")
# Subsampling
makeResampleDesc("Subsample", iters = 10, split = 3/4)
makeResampleDesc("Subsample", iters = 10)
# Holdout a.k.a. test sample estimation
makeResampleDesc("Holdout")
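
The relation iters = folds * reps stated for "RepCV" can be checked with a small base R sketch that builds the test index sets by hand. The helper repeated_cv_test_sets is a hypothetical name for illustration only and does not reflect mlr internals.

```r
# Hypothetical helper, not part of mlr: test index sets for repeated CV.
repeated_cv_test_sets <- function(n, folds = 10, reps = 10) {
  sets <- list()
  for (r in seq_len(reps)) {
    # Draw a new random assignment of observations to folds per repetition.
    fold_id <- sample(rep(seq_len(folds), length.out = n))
    for (f in seq_len(folds)) {
      sets[[length(sets) + 1L]] <- which(fold_id == f)
    }
  }
  sets
}

set.seed(1)
test_sets <- repeated_cv_test_sets(n = 150, folds = 5, reps = 2)
length(test_sets)  # folds * reps = 10 iterations
```

Within each repetition the folds partition all observations, so every observation is used for testing exactly once per repetition.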