data_partition: Partition data into a test and a training set

Description

Creates a training set and a test set from a data frame. The split can also be stratified (i.e., the levels of one or more factors are spread evenly across both sets) using the group argument.

Usage

data_partition(data, training_proportion = 0.7, group = NULL, seed = NULL, ...)

Value

A list of two data frames, named test and training.

Arguments

data: A data frame, or an object that can be coerced to a data frame.
training_proportion: The proportion (between 0 and 1) of rows assigned to the training set. The remaining rows are used for the test set.
group: A character vector giving the name(s) of the column(s) used for stratified partitioning. Defaults to NULL.
seed: A random number generator seed. Enter an integer (e.g. 123) so that the random sampling will be the same each time you run the function.
...: Other arguments passed to or from other functions.

Examples

df <- iris
df$Smell <- rep(c("Strong", "Light"), 75)

data_partition(df)
data_partition(df, group = "Species")
data_partition(df, group = c("Species", "Smell"))
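
To illustrate what stratified partitioning means, the following base R sketch splits iris so that each level of Species contributes roughly the same share of rows to the training set. It is only a sketch of the idea under simple assumptions, not the implementation of data_partition; the helper name stratified_split_sketch is hypothetical and does not exist in any package.

```r
# Hypothetical helper, for illustration only; not the internals of data_partition().
stratified_split_sketch <- function(data, training_proportion = 0.7,
                                    group, seed = NULL) {
  # Fix the random number generator so the split is reproducible
  if (!is.null(seed)) set.seed(seed)

  # Row indices split by the levels (or level combinations) of the group column(s)
  strata <- split(seq_len(nrow(data)), data[group], drop = TRUE)

  # Within each stratum, draw the training rows at random
  training_rows <- unlist(lapply(strata, function(rows) {
    n_train <- round(length(rows) * training_proportion)
    rows[sample.int(length(rows), n_train)]
  }), use.names = FALSE)

  # All remaining rows form the test set
  test_rows <- setdiff(seq_len(nrow(data)), training_rows)

  list(
    test = data[test_rows, , drop = FALSE],
    training = data[training_rows, , drop = FALSE]
  )
}

out <- stratified_split_sketch(iris, group = "Species", seed = 123)
table(out$training$Species) # 35 rows per species
table(out$test$Species)     # 15 rows per species
```

Because the rows are sampled within each stratum, the 50/50/50 balance of Species in iris is preserved in both sets, which a simple random split of all 150 rows would not guarantee.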
