split_by: Split Data into Train and Test Set
Description
split_by() splits a data.frame or tbl_df into a train set and a test set.
Usage
split_by(.data, ...)

# S3 method for data.frame
split_by(.data, target, ratio = 0.7, seed = NULL, ...)
Value
An object of class split_df.
Arguments
- .data
a data.frame or a tbl_df.
- ...
further arguments passed to or from other methods.
- target
unquoted expression or variable name. the name of the target variable
- ratio
numeric. the proportion of observations assigned to the train set. default is 0.7
- seed
random seed used for splitting
attributes of split_df
The attributes of the split_df class are as follows:
split_seed : integer. random seed used for splitting
target : character. the name of the target variable
binary : logical. whether the target variable is binary class
minority : character. the name of the minority class
majority : character. the name of the majority class
minority_rate : numeric. the rate of the minority class
majority_rate : numeric. the rate of the majority class
Details
The split_df class is created, which contains the split information and criteria to separate the training and the test set.
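To make the roles of ratio and seed concrete, here is a minimal base R sketch of a seeded random split. It is only an illustration and is not the code split_by() uses. The function name split_sketch and the column name split_flag are hypothetical, and the built-in mtcars data stands in for real input.

```r
# Minimal base R sketch of a seeded train/test split.
# Assumption: this illustrates the idea only; it is not split_by()'s implementation.
split_sketch <- function(data, ratio = 0.7, seed = NULL) {
  # Fix the random number generator so the same seed gives the same split
  if (!is.null(seed)) set.seed(seed)

  n <- nrow(data)

  # Draw the row indices of the train set without replacement
  train_idx <- sample(n, size = floor(n * ratio))

  # Label every row as "train" or "test" (hypothetical column name)
  flag <- rep("test", n)
  flag[train_idx] <- "train"
  data$split_flag <- flag

  data
}

res <- split_sketch(mtcars, ratio = 0.7, seed = 123)

# Count the rows that fall into each set
table(res$split_flag)
```

Calling split_sketch() twice with the same seed yields the same assignment of rows, which is why recording the seed alongside the split makes it reproducible.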
Examples
library(alookr)
library(dplyr)
# Credit Card Default Data
head(ISLR::Default)
# Generate data for the example
sb <- ISLR::Default %>%
split_by(default)
sb