gecko (version 1.0.2)

splitDataset: Split a dataset for model training

Description

Split a dataset into training and test sets while keeping each class represented in both.

Usage

splitDataset(data, proportion)

Value

list. The first element is the training data; the second element is the test data.

Arguments

data

dataframe. Contains classification data. The last column must contain the class labels.

proportion

numeric. A value between 0 and 1 determining the proportion of the dataset split between training and testing.
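
To illustrate what a class-preserving split involves, here is a minimal base-R sketch. It is not gecko's implementation: the helper name stratified_split_sketch is hypothetical, and it assumes that proportion is the fraction of each class assigned to the training set and that the label sits in the last column, as the data argument requires.

```r
# Hypothetical helper, NOT gecko's implementation: a base-R stratified split.
# Assumption: `proportion` is the fraction of each class sent to training.
stratified_split_sketch <- function(data, proportion) {
  stopifnot(is.data.frame(data), proportion > 0, proportion < 1)
  labels <- data[[ncol(data)]]  # label is assumed to be the last column
  # Row indices grouped by class label
  rows_by_class <- split(seq_len(nrow(data)), labels)
  # Sample the same fraction of rows from every class
  train_idx <- unlist(lapply(rows_by_class, function(idx) {
    n_train <- round(length(idx) * proportion)
    idx[sample.int(length(idx), n_train)]
  }), use.names = FALSE)
  # Everything not sampled for training goes to the test set
  test_idx <- setdiff(seq_len(nrow(data)), train_idx)
  list(data[train_idx, , drop = FALSE], data[test_idx, , drop = FALSE])
}

my_data <- data.frame(
  X = runif(60), Y = runif(60), Z = runif(60),
  Label = c(rep("A", 20), rep("B", 30), rep("C", 10))
)
parts <- stratified_split_sketch(my_data, 0.8)
train <- parts[[1]]  # first element: training data
test <- parts[[2]]   # second element: test data
table(train$Label)   # class counts in the training set
table(test$Label)    # class counts in the test set
```

Sampling within each class, rather than over all rows at once, is what keeps a small class such as C from ending up entirely in one of the two sets.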

Examples

library(gecko)
# Binary label case
my_data <- data.frame(
  X = runif(20), Y = runif(20), Z = runif(20),
  Label = c(rep("presence", 10), rep("outlier", 10))
)
splitDataset(my_data, 0.8)

# Multi-class label case
my_data <- data.frame(
  X = runif(60), Y = runif(60), Z = runif(60),
  Label = c(rep("A", 20), rep("B", 30), rep("C", 10))
)
splitDataset(my_data, 0.8)