alookr (version 0.5.0)

split_by: Split Data into Train and Test Set

Description

split_by() splits a data.frame or tbl_df into a training set and a test set.

Usage

split_by(.data, ...)

# S3 method for data.frame
split_by(.data, target, ratio = 0.7, seed = NULL, ...)

Value

An object of class split_df.

Arguments

.data

a data.frame or a tbl_df.

...

further arguments passed to or from other methods.

target

unquoted expression or variable name. The name of the target variable.

ratio

numeric. The proportion of rows assigned to the training set. Default is 0.7.

seed

random seed used for splitting. Default is NULL.

Attributes of split_df

The attributes of the split_df class are as follows:

  • split_seed : integer. random seed used for splitting

  • target : character. the name of the target variable

  • binary : logical. whether the target variable is binary class

  • minority : character. the name of the minority class

  • majority : character. the name of the majority class

  • minority_rate : numeric. the rate of the minority class

  • majority_rate : numeric. the rate of the majority class
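
To make the class-balance attributes concrete, the following base R sketch derives comparable values from a target vector. It is only an illustration: describe_target() is a hypothetical helper, not part of alookr, and the way alookr computes these attributes internally is not shown in this page.

```r
# Hypothetical helper (not part of alookr): summarises the class balance
# of a target vector in the spirit of the split_df attributes listed above.
describe_target <- function(y) {
  counts <- table(y)                      # frequency of each class
  list(
    binary        = length(counts) == 2,  # TRUE when exactly two classes exist
    minority      = names(counts)[which.min(counts)],
    majority      = names(counts)[which.max(counts)],
    minority_rate = min(counts) / sum(counts),
    majority_rate = max(counts) / sum(counts)
  )
}

# mtcars$am is a built-in 0/1 variable, used here as a stand-in target
info <- describe_target(mtcars$am)
str(info)
```

On an actual split_df object, the documented attributes can be read with base R's attr(), for example attr(sb, "minority").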

Details

split_by() creates an object of class split_df, which holds the split information and the criteria used to separate the training set from the test set.
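
As a rough illustration of what a ratio-based, seeded split involves, here is a minimal base R sketch. It assumes simple random sampling and a flag column named split_flag; both are assumptions made for this example, and simple_split() is a hypothetical helper rather than alookr's implementation.

```r
# Hypothetical helper (not part of alookr): flags each row as "train" or
# "test" using simple random sampling. Stratification is not attempted.
simple_split <- function(data, ratio = 0.7, seed = NULL) {
  # Setting the seed makes the split reproducible (this changes the global RNG state)
  if (!is.null(seed)) set.seed(seed)

  n <- nrow(data)
  train_idx <- sample.int(n, size = round(n * ratio))  # rows chosen for training

  flag <- rep("test", n)
  flag[train_idx] <- "train"
  data$split_flag <- flag   # assumed column name, for illustration only
  data
}

# iris is a built-in dataset with 150 rows
res <- simple_split(iris, ratio = 0.7, seed = 123)
table(res$split_flag)
```

Passing the same seed twice yields the same assignment, which is the purpose of the seed argument described above.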

Examples

library(alookr)
library(dplyr)

# Credit Card Default Data
head(ISLR::Default)

# Split the data into train and test sets (default ratio = 0.7)
sb <- ISLR::Default %>%
  split_by(default)

# Print the resulting split_df object
sb