This function randomly splits a data frame into three subsets for machine
learning workflows: training, validation, and test sets. The proportions
can be customized and must sum to 1.
train_prop
A numeric value between 0 and 1 specifying the proportion
of data to allocate to the training set.
validate_prop
A numeric value between 0 and 1 specifying the proportion
of data to allocate to the validation set.
test_prop
A numeric value between 0 and 1 specifying the proportion
of data to allocate to the test set.
seed
(Optional) A numeric value used to set the random number seed within the function environment.
Details
The function assigns each row to "train", "validate" or "test" with
probabilities `train_prop`, `validate_prop` and `test_prop` respectively.
Because each row is assigned to a bucket independently, the realized
proportions may differ from the requested ones for very small datasets.
This should not be an issue in practice, because data used for `iblm` must
be reasonably large.
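
The independent per-row assignment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's actual implementation: the helper name `split_sketch`, its `df` argument and the list it returns are hypothetical. It also calls `set.seed()` directly, which changes the global random number state, whereas the documented function sets the seed within its own environment.

```r
# Hypothetical helper for illustration only; not part of `iblm`.
split_sketch <- function(df, train_prop, validate_prop, test_prop, seed = NULL) {
  # The three proportions must sum to 1 (up to floating-point tolerance).
  stopifnot(isTRUE(all.equal(train_prop + validate_prop + test_prop, 1)))

  # Assumption: a plain set.seed() call, which affects the global RNG state.
  if (!is.null(seed)) set.seed(seed)

  # Draw one label per row, independently, with the requested probabilities.
  bucket <- sample(
    c("train", "validate", "test"),
    size = nrow(df),
    replace = TRUE,
    prob = c(train_prop, validate_prop, test_prop)
  )

  # Return the three subsets as a named list of data frames.
  list(
    train    = df[bucket == "train", , drop = FALSE],
    validate = df[bucket == "validate", , drop = FALSE],
    test     = df[bucket == "test", , drop = FALSE]
  )
}

# With only 32 rows (built-in mtcars), the realized counts can differ
# noticeably from the requested 70/15/15 split.
parts <- split_sketch(mtcars, 0.7, 0.15, 0.15, seed = 42)
sapply(parts, nrow)
```

Every row lands in exactly one subset, so the three row counts always add up to `nrow(df)`; only the share in each subset varies from draw to draw.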