Splits data frames or matrices into random train and test subsets. The column at position y_index of data is used as the response variable (y), and the remaining columns are used as the independent variables (X).
train_test_split(
data,
test_size = 0.3,
random_state = NULL,
y_index = ncol(data)
)
A list of length 4 with elements:
X_train | Training input variables |
X_test | Test input variables |
y_train | Training response variables |
y_test | Test response variables |
data | Dataset to be split (a data frame or matrix) |
test_size | Proportion of the dataset to include in the test split. Should be between 0.0 and 1.0 (defaults to 0.3) |
random_state | Controls the shuffling applied to the data before the split. Pass an integer for reproducible output across multiple function calls (defaults to NULL) |
y_index | Column index of the response variable y (defaults to the last column of data) |
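To make the roles of the arguments above concrete, here is a minimal base R sketch of a split with the same interface. It is not the package's implementation: the helper name sketch_split, the rounding of the test set size, and the use of set.seed for random_state are all assumptions made for illustration. The built-in iris dataset stands in for real data.

```r
# Hypothetical helper (not the package's code): a base R sketch of the
# behaviour described by the arguments above.
sketch_split <- function(data, test_size = 0.3, random_state = NULL,
                         y_index = ncol(data)) {
  stopifnot(test_size > 0, test_size < 1)
  # Assumption: seed the RNG only when random_state is supplied, so that
  # NULL yields a different shuffle on each call.
  if (!is.null(random_state)) set.seed(random_state)
  n <- nrow(data)
  # Assumption: the number of test rows is rounded to the nearest integer.
  test_rows <- sample.int(n, size = round(n * test_size))
  train_rows <- setdiff(seq_len(n), test_rows)
  list(
    X_train = data[train_rows, -y_index, drop = FALSE],
    X_test  = data[test_rows, -y_index, drop = FALSE],
    y_train = data[train_rows, y_index],
    y_test  = data[test_rows, y_index]
  )
}

# iris ships with R; column 5 (Species) is used as the response here.
parts <- sketch_split(iris, test_size = 0.2, random_state = 42, y_index = 5)
dim(parts$X_train)    # 120 rows, 4 columns
length(parts$y_test)  # 30
```

Calling the sketch twice with the same random_state returns identical subsets. That reproducibility is what passing an integer is meant to provide.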
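# Load the example dataset
data(abalone)
# Hold out 30% of the rows as the test set
split_list <- train_test_split(abalone, test_size = 0.3)
# Unpack the four elements of the returned list
X_train <- split_list[[1]]
X_test <- split_list[[2]]
y_train <- split_list[[3]]
y_test <- split_list[[4]]
# Inspect the first rows of each subset
print(head(X_train))
print(head(X_test))
print(head(y_train))
print(head(y_test))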