This method creates a kFoldPartition object, from which the dataset partitions for training, testing and, optionally, validation can be created.
create_kfold_partition(
mdata,
k = 10,
method = c("random", "iterative", "stratified")
)
mdata: A mldr dataset.
k: The desired number of folds. (Default: 10)
method: The method used to split the data. The default methods are:
random: Split the folds randomly.
iterative: Split the folds considering the proportion of each label individually. Some specific labels may not occur in all folds.
stratified: Split the folds considering the labelset proportions.
You can also create your own partition method. See the note and example sections for more details. (Default: "random")
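The difference between a random split and a labelset-stratified split can be illustrated with a minimal base R sketch. It does not reproduce the package's internal implementation; the toy label table and the helper names `random_folds` and `stratified_folds` are hypothetical and exist only for this illustration.

```r
set.seed(42)

# Hypothetical toy data: 12 instances, 2 binary labels,
# giving 4 distinct labelsets with 3 instances each.
labels <- data.frame(
  y1 = c(1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0),
  y2 = c(1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0)
)

# Random split: shuffle the row indexes and deal them into k folds.
random_folds <- function(n, k) {
  fold_id <- rep_len(seq_len(k), n)
  unname(split(sample(n), fold_id))
}

# Labelset-stratified split (simplified): shuffle the rows of each
# distinct label combination and deal them across the folds, so every
# fold receives a similar share of each labelset.
stratified_folds <- function(labels, k) {
  labelset <- do.call(paste, c(labels, sep = ""))
  fold_id <- integer(nrow(labels))
  for (rows in split(seq_len(nrow(labels)), labelset)) {
    rows <- rows[sample.int(length(rows))]
    fold_id[rows] <- rep_len(seq_len(k), length(rows))
  }
  unname(split(seq_len(nrow(labels)), fold_id))
}

rf <- random_folds(nrow(labels), 3)
sf <- stratified_folds(labels, 3)

# Count the labelsets present in each stratified fold.
lapply(sf, function(idx) table(do.call(paste, c(labels[idx, ], sep = ""))))
```

With the random split, a fold may by chance miss a labelset entirely; the stratified sketch deals each labelset across the folds, which is the property the "stratified" option is described as preserving.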
An object of type kFoldPartition.
Sechidis, K., Tsoumakas, G., & Vlahavas, I. (2011). On the stratification of multi-label data. In Proceedings of the Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD (pp. 145-158).
How to create the datasets from folds
Other sampling: create_holdout_partition(), create_random_subset(), create_subset()
# NOT RUN {
k10 <- create_kfold_partition(toyml, 10)
k5 <- create_kfold_partition(toyml, 5, "stratified")
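# Custom split method: assign consecutive instances to each fold.
# r holds the proportion of instances for each fold.
sequencial_split <- function (mdata, r) {
  amount <- trunc(r * mdata$measures$num.instances)
  # Fold boundaries; the last fold absorbs the rounding remainder
  indexes <- c(0, cumsum(amount))
  indexes[length(r)+1] <- mdata$measures$num.instances
  lapply(seq_along(r), function (i) {
    seq(indexes[i]+1, indexes[i+1])
  })
}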
k3 <- create_kfold_partition(toyml, 3, "sequencial_split")
# }
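For readers unfamiliar with k-fold evaluation, the following base R sketch shows the general idea of turning a list of fold indexes into train and test index sets: each fold serves once as the test set while the remaining folds form the training set. This is not the package's own helper (the topic "How to create the datasets from folds" covers that); `folds` and `train_test_indexes` are hypothetical names used only for illustration.

```r
# Hypothetical fold indexes for a dataset with 9 instances and k = 3.
folds <- list(1:3, 4:6, 7:9)

# Use fold i as the test set and the union of the other folds as training.
train_test_indexes <- function(folds, i) {
  list(
    train = sort(unlist(folds[-i])),
    test  = folds[[i]]
  )
}

# Iterate over the folds, as a cross-validation loop would.
for (i in seq_along(folds)) {
  part <- train_test_indexes(folds, i)
  cat("fold", i, "- train:", part$train, "| test:", part$test, "\n")
}
```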