
train_test_split
Function for partitioning data into train and test sets.
train_test_split(dat, prop = 0.7, split_type = c("Random", "OOT",
"byRow"), occur_time = NULL, cut_date = NULL, start_date = NULL,
save_data = FALSE, dir_path = tempdir(), file_name = NULL,
note = FALSE, seed = 43)
dat: A data.frame with independent variables and target variable.
prop: The proportion of samples assigned to the train set after the partition. Default is 0.7.
split_type: Method for the partition. One of:
"Random" splits the train and test sets randomly.
"OOT" splits by time for an observation over time test.
"byRow" splits by row number.
occur_time: The name of the variable that represents the time at which each observation takes place. It is used for the "OOT" split.
cut_date: Time point for splitting the data sets, e.g. splitting Actual and Expected data sets.
start_date: The earliest occurrence time of observations.
save_data: Logical, save results in a locally specified folder. Default is FALSE.
dir_path: The path for the periodically saved data file. Default is tempdir().
file_name: The name for the periodically saved data file. Default is NULL.
note: Logical. Outputs info. Default is FALSE.
seed: Random number seed. Default is 43.
A list with the train and test partitions, accessed as $train and $test.
# NOT RUN {
train_test <- train_test_split(lendingclub,
                               split_type = "OOT", prop = 0.7,
                               occur_time = "issue_d", seed = 12,
                               save_data = FALSE)
dat_train <- train_test$train
dat_test <- train_test$test
# }
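The three split types can be illustrated with a minimal base R sketch. This is not the package's implementation: split_sketch and the toy data are hypothetical names introduced only for illustration, and the "OOT" branch assumes that the earliest prop share of observations (ordered by occur_time) goes to the train set.

```r
# Hypothetical helper, not part of the documented package: a rough
# illustration of the three split strategies using base R only.
split_sketch <- function(dat, prop = 0.7,
                         split_type = c("Random", "OOT", "byRow"),
                         occur_time = NULL, seed = 43) {
  split_type <- match.arg(split_type)
  n <- nrow(dat)
  n_train <- round(n * prop)
  train_idx <- switch(split_type,
    Random = {
      # Draw n_train row indices at random, reproducibly.
      set.seed(seed)
      sample.int(n, n_train)
    },
    OOT = {
      # Assumption: the earliest observations form the train set.
      order(dat[[occur_time]])[seq_len(n_train)]
    },
    # Take the first n_train rows in their existing order.
    byRow = seq_len(n_train)
  )
  test_idx <- setdiff(seq_len(n), train_idx)
  list(train = dat[train_idx, , drop = FALSE],
       test = dat[test_idx, , drop = FALSE])
}

# Toy data (made up for this sketch): ten observations with shuffled dates.
toy <- data.frame(
  id = 1:10,
  issue_d = as.Date("2020-01-01") + c(9, 0, 5, 3, 7, 1, 8, 2, 6, 4),
  target = rep(c(0, 1), 5)
)

oot <- split_sketch(toy, prop = 0.7, split_type = "OOT",
                    occur_time = "issue_d")
```

Under these assumptions, every date in oot$train precedes every date in oot$test, which is the property a time-based split is meant to guarantee. The real function may handle cut_date, start_date and ties differently.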