promor (version 0.2.1)

split_data: Split the data frame to create training and test data

Description

This function creates balanced splits of the protein intensity data in a model_df object, producing training and test data sets.

Usage

split_data(model_df, train_size = 0.8, seed = NULL)

Value

A list of two data frames: training and test.

Arguments

model_df

A model_df object created by pre_process.

train_size

The size of the training data set as a proportion of the complete data set. Default is 0.8.

seed

Numerical. Random number seed. Default is NULL.

Author

Chathurani Ranathunge

Details

This function splits the model_df object into training and test data sets using random sampling while preserving the original class distribution of the data. Set seed to a fixed value to make the split reproducible.
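To illustrate what a class-preserving (stratified) split means, here is a minimal base R sketch. It is not promor's implementation: the helper stratified_split, the toy data frame, and the column names condition and intensity are all hypothetical and chosen only for this example. The idea is to sample the same proportion of rows from each class, so that the training and test sets keep the original class ratio.

```r
# Hypothetical helper for illustration only; not part of promor.
# Splits a data frame so that each class contributes the same
# proportion of its rows to the training set.
stratified_split <- function(df, class_col, train_size = 0.8, seed = NULL) {
  # Fix the random number seed only when one is supplied
  if (!is.null(seed)) set.seed(seed)

  # Row indices grouped by class label
  idx_by_class <- split(seq_len(nrow(df)), df[[class_col]])

  # Sample train_size of the rows within every class
  train_idx <- unlist(lapply(idx_by_class, function(idx) {
    n_train <- round(length(idx) * train_size)
    idx[sample.int(length(idx), n_train)]
  }), use.names = FALSE)

  # Everything not sampled for training goes to the test set
  test_idx <- setdiff(seq_len(nrow(df)), train_idx)

  list(
    training = df[train_idx, , drop = FALSE],
    test = df[test_idx, , drop = FALSE]
  )
}

# Toy data (assumed for this example): 60 rows of class "A", 40 of class "B"
toy <- data.frame(
  condition = rep(c("A", "B"), times = c(60, 40)),
  intensity = seq_len(100)
)

toy_split <- stratified_split(toy, "condition", train_size = 0.8, seed = 8314)

# Class counts in each part of the split
table(toy_split$training$condition)
table(toy_split$test$condition)
```

With these toy inputs, the training set holds 48 rows of class A and 32 of class B, and the test set holds 12 and 8, so both parts keep the original 60:40 ratio. Calling the helper again with the same seed returns the same rows.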

See Also

Examples

## Create a model_df object
covid_model_df <- pre_process(covid_fit_df, covid_norm_df)

## Split the data frame into training and test data sets using default settings
covid_split_df1 <- split_data(covid_model_df, seed = 8314)

## Split the data frame into training and test data sets with 70% of the
## data in training and 30% in test data sets
covid_split_df2 <- split_data(covid_model_df, train_size = 0.7, seed = 8314)

## Access training data set
covid_split_df1$training

## Access test data set
covid_split_df1$test
