Creates a split of the row ids of a Task into a training set and a test set while optionally stratifying on the target column.
partition(task, ratio = 0.67, stratify = TRUE, ...)
# S3 method for TaskRegr
partition(task, ratio = 0.67, stratify = TRUE, bins = 3L, ...)
# S3 method for TaskClassif
partition(task, ratio = 0.67, stratify = TRUE, ...)
task: (Task) Task to operate on.
ratio: (numeric(1)) Ratio of observations to put into the training set.
stratify: (logical(1)) If TRUE, stratify on the target variable. For regression tasks, the target variable is first cut into the number of bins given by the bins argument. See Task$add_strata().
...: (any) Additional arguments, currently not used.
bins: (integer(1)) Number of bins to cut the target variable into for stratification.
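To make the stratification arguments more concrete, here is a minimal base R sketch of a stratified split of row ids. The helper `stratified_split()` is hypothetical and not part of mlr3, and the use of equal-width intervals via `cut()` for numeric targets is an assumption for illustration; the actual behavior is defined by Task$add_strata().

```r
# Hypothetical helper, NOT mlr3's implementation: stratified split of row ids.
stratified_split = function(y, ratio = 0.67, bins = 3L) {
  # Numeric targets are binned first (equal-width intervals are an assumption here);
  # other targets are stratified on their labels directly.
  strata = if (is.numeric(y)) cut(y, breaks = bins) else factor(y)
  ids = seq_along(y)

  # Within each stratum, draw about `ratio` of the ids for the training set.
  train = unlist(lapply(split(ids, strata), function(idx) {
    n = round(length(idx) * ratio)
    idx[sample.int(length(idx), n)]
  }), use.names = FALSE)

  list(train = sort(train), test = setdiff(ids, train))
}

set.seed(1)
y = rnorm(300)
s = stratified_split(y, ratio = 0.5, bins = 3L)
lengths(s)
```

Because sampling happens within each stratum, the training and test sets should end up with a similar distribution of the (binned) target, which is what `stratify = TRUE` is meant to achieve.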
# NOT RUN {
# regression task
task = tsk("boston_housing")
# roughly equal size split while stratifying on the binned response
split = partition(task, ratio = 0.5)
data = data.frame(
  y = c(task$truth(split$train), task$truth(split$test)),
  # label each observation with the set it belongs to
  split = rep(c("train", "test"), lengths(split[c("train", "test")]))
)
boxplot(y ~ split, data = data)
# classification task
task = tsk("pima")
split = partition(task)
# roughly same distribution of the target label
prop.table(table(task$truth(split$train)))
prop.table(table(task$truth(split$test)))
# }
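The row ids returned by partition() are typically passed on to a learner. The following is a short sketch of that workflow, assuming the mlr3 package is installed; the featureless baseline learner is chosen here only because it needs no additional packages and tolerates missing values.

```r
library(mlr3)

task = tsk("pima")
split = partition(task, ratio = 0.67)

# Fit on the training ids only (featureless baseline, chosen for illustration)
learner = lrn("classif.featureless")
learner$train(task, row_ids = split$train)

# Predict on the held-out test ids and score the result
prediction = learner$predict(task, row_ids = split$test)
prediction$score(msr("classif.acc"))
```

Any other classification learner can be substituted for the baseline; the split itself stays the same.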