
datawizard (version 0.4.0)

data_partition: Partition data into a test and a training set

Description

Splits a data frame into a training set and a test set. The split can also be stratified (i.e., the levels of a given factor are spread evenly across both sets) using the group argument.

Usage

data_partition(data, training_proportion = 0.7, group = NULL, seed = NULL, ...)

Arguments

data

A data frame, or an object that can be coerced to a data frame.

training_proportion

The proportion (between 0 and 1) of rows assigned to the training set. The remaining rows are used for the test set.

group

A character vector indicating the name(s) of the column(s) used for stratified partitioning.

seed

A random number generator seed. Enter an integer (e.g. 123) so that the random sampling will be the same each time you run the function.

...

Other arguments passed to or from other functions.
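
To illustrate why a fixed seed makes the partition reproducible, here is a minimal base R sketch. It assumes the seed is applied through set.seed() before the rows are sampled, which this page does not state explicitly; the variable names are hypothetical.

```r
# Assumption: the seed argument behaves like a call to set.seed().
# Drawing 7 of 10 row indices twice with the same seed gives the same rows.
set.seed(123)
first_draw <- sample(10, 7)

set.seed(123)
second_draw <- sample(10, 7)

identical(first_draw, second_draw)  # TRUE
```

Without resetting the seed between the two calls, the draws would generally differ, which is why an unseeded partition changes from run to run.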

Value

A list of two data frames, named test and training.

Examples

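# Add a second grouping variable to the iris data
df <- iris
df$Smell <- rep(c("Strong", "Light"), 75)

# Simple random split (70% training by default)
data_partition(df)
# Stratified by one column
data_partition(df, group = "Species")
# Stratified by two columns
data_partition(df, group = c("Species", "Smell"))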
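
For readers who want to see what stratified partitioning means in practice, the following is a rough base R sketch of the idea. It is not datawizard's implementation: stratified_split() is a hypothetical helper, and details such as rounding the per-group training size are assumptions made for illustration.

```r
# Hypothetical helper, not part of datawizard: a stratified split in base R.
stratified_split <- function(data, training_proportion = 0.7,
                             group = NULL, seed = NULL) {
  # Fix the random number generator only when a seed is supplied.
  if (!is.null(seed)) set.seed(seed)

  all_rows <- seq_len(nrow(data))

  # Without grouping columns, treat the whole data frame as one stratum.
  # Otherwise, collect the row indices of each group (or group combination).
  strata <- if (is.null(group)) {
    list(all_rows)
  } else {
    split(all_rows, data[group], drop = TRUE)
  }

  # Within each stratum, draw the training rows at random.
  # Assumption: the training size per stratum is rounded to a whole number.
  training_rows <- unlist(lapply(strata, function(rows) {
    n_train <- round(training_proportion * length(rows))
    rows[sample.int(length(rows), n_train)]
  }), use.names = FALSE)

  # Every row not drawn for training goes to the test set.
  test_rows <- setdiff(all_rows, training_rows)

  list(
    training = data[training_rows, , drop = FALSE],
    test = data[test_rows, , drop = FALSE]
  )
}

parts <- stratified_split(iris, group = "Species", seed = 123)
table(parts$training$Species)  # 35 rows for each of the three species
nrow(parts$test)               # 45
```

Because the sampling happens inside each group, every species contributes the same share of its rows to the training set, which a simple random split does not guarantee.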
