lares (version 4.8.4)

msplit: Split a dataframe for training and testing sets

Description

This function automatically splits a dataframe into train and test datasets. You can set a seed to get the same results every time; if you do not, a default value is used. You can also prevent it from printing the split counter result.

Usage

msplit(df, size = 0.7, seed = 0, print = TRUE)

Arguments

df

Dataframe to split

size

Numeric. Split rate, between 0 and 1: the proportion of rows assigned to the training set. If set to 1, the train and test sets will be the same.

seed

Seed for random split

print

Print summary results

Value

A list with both datasets, summary, and split rate
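
To make the idea of a seeded split concrete, here is a minimal base-R sketch. It is not the lares implementation: `simple_split()` and the names of its list elements are hypothetical, chosen only for illustration, and it assumes that `size` is the share of rows sent to the training set.

```r
# Hypothetical helper, NOT part of lares: a base-R sketch of a seeded train/test split.
simple_split <- function(df, size = 0.7, seed = 0) {
  stopifnot(is.data.frame(df), size > 0, size <= 1)
  set.seed(seed)                             # fixes the (global) RNG so the split is reproducible
  n <- nrow(df)
  train_idx <- sample(n, floor(size * n))    # training rows, drawn without replacement
  test_idx <- setdiff(seq_len(n), train_idx) # every row not used for training
  train <- df[train_idx, , drop = FALSE]
  # Assumption: with size = 1 the training rows are reused as the test set
  test <- if (size == 1) train else df[test_idx, , drop = FALSE]
  list(train = train, test = test, split_size = size)
}

parts <- simple_split(mtcars, size = 0.7, seed = 123)
sapply(parts[c("train", "test")], nrow)      # row counts of each part
```

Calling the helper twice with the same seed returns the same rows, which is the behaviour the `seed` argument is meant to provide.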

See Also

Other Machine Learning: ROC(), clusterKmeans(), conf_mat(), export_results(), gain_lift(), h2o_automl(), h2o_predict_API(), h2o_predict_MOJO(), h2o_predict_binary(), h2o_predict_model(), h2o_results(), h2o_selectmodel(), impute(), iter_seeds(), lasso_vars(), model_metrics()

Other Tools: autoline(), bindfiles(), bring_api(), db_download(), db_upload(), export_plot(), export_results(), get_credentials(), h2o_predict_API(), h2o_predict_MOJO(), h2o_predict_binary(), h2o_predict_model(), h2o_selectmodel(), h2o_update(), haveInternet(), image_metadata(), importxlsx(), ip_country(), iter_seeds(), json2vector(), listfiles(), mailSend(), myip(), pass(), quiet(), read.file(), statusbar(), tic(), try_require(), updateLares(), zerovar()

Examples

# NOT RUN {
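library(lares)
data(dft) # Titanic dataset
# Split with a 0.7 rate; the fixed seed makes the split reproducible
splits <- msplit(dft, size = 0.7, seed = 123)
names(splits) # names of the elements in the returned list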
# }