datarobot (version 2.18.6)

CreateStratifiedPartition: Create a stratified sampling-based S3 object of class partition for the SetTarget function

Description

Stratified partitioning is supported for binary classification problems. It randomly partitions the modeling data while keeping the percentage of positive-class observations in each partition the same as in the original dataset. Two split types are supported: Training/Validation/Holdout ("TVH") and cross-validation ("CV"). In both cases, the holdout percentage (holdoutPct) must be specified. The "CV" method also requires the number of cross-validation folds (reps), while the "TVH" method also requires the validation subset percentage (validationPct).
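To make the idea of stratification concrete, the following base R sketch assigns rows to a holdout set and to cross-validation folds separately within each class, so every partition keeps the original positive-class rate. It is an illustration only and is not how DataRobot performs the split; the function stratified_assign and the simulated target y are hypothetical names introduced here.

```r
# Illustrative only: a hand-rolled stratified split, NOT DataRobot's implementation.
# stratified_assign() is a hypothetical helper written for this example.
stratified_assign <- function(y, holdoutPct, reps) {
  stopifnot(length(unique(y)) == 2, holdoutPct >= 0, holdoutPct < 100, reps >= 2)
  assignment <- character(length(y))
  for (cls in unique(y)) {
    idx <- which(y == cls)
    idx <- idx[sample.int(length(idx))]  # shuffle row indices within the class
    n_holdout <- round(length(idx) * holdoutPct / 100)
    holdout_idx <- idx[seq_len(n_holdout)]
    rest_idx <- idx[seq_len(length(idx) - n_holdout) + n_holdout]
    assignment[holdout_idx] <- "Holdout"
    # Deal the remaining rows of this class round-robin into the folds
    assignment[rest_idx] <- paste0("Fold", (seq_along(rest_idx) - 1) %% reps + 1)
  }
  assignment
}

set.seed(42)
y <- c(rep(1, 200), rep(0, 800))  # simulated binary target, 20% positive class
parts <- stratified_assign(y, holdoutPct = 20, reps = 5)

# Positive-class rate per partition; each one matches the overall rate of 0.2
rates <- tapply(y, parts, mean)
print(rates)
```

Because the class sizes here divide evenly, every partition reproduces the 20% rate exactly; with less convenient sizes the rates would only be approximately equal.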

Usage

CreateStratifiedPartition(
  validationType,
  holdoutPct,
  reps = NULL,
  validationPct = NULL
)

Value

An S3 object of class 'partition' including the parameters required by the SetTarget function to generate a stratified partitioning of the modeling dataset.

Arguments

validationType

character. String specifying the type of partition generated, either "TVH" or "CV".

holdoutPct

integer. The percentage of data to be used as the holdout subset.

reps

integer. The number of cross-validation folds to generate; only applicable when validationType = "CV".

validationPct

integer. The percentage of data to be used as the validation subset; only applicable when validationType = "TVH".

Details

This function is one of several convenience functions provided to simplify the task of starting modeling projects with custom partitioning options. The other functions are CreateGroupPartition, CreateRandomPartition, and CreateUserPartition.

See Also

CreateGroupPartition, CreateRandomPartition, CreateUserPartition.

Examples

CreateStratifiedPartition(validationType = "CV", holdoutPct = 20, reps = 5)
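The "TVH" case follows the same pattern, with validationPct supplied in place of reps. The sketch below assumes the datarobot package is installed and that building the partition object does not need a live DataRobot connection; the percentages, the project object, and the target name "isBad" are illustrative assumptions rather than recommended values.

```r
# Guarded so the snippet is a no-op when the datarobot package is not installed.
if (requireNamespace("datarobot", quietly = TRUE)) {
  # TVH split: 20% holdout and 16% validation (illustrative values)
  tvh <- datarobot::CreateStratifiedPartition(
    validationType = "TVH",
    holdoutPct = 20,
    validationPct = 16
  )
  print(class(tvh))  # documented above as an S3 object of class 'partition'

  # The partition is then passed to SetTarget when starting a project.
  # `project` and the target name are hypothetical placeholders:
  # datarobot::SetTarget(project, target = "isBad", partition = tvh)
}
```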
