train_test_split

A data.frame with independent variables and target variable.

The proportion of samples assigned to the training set after the partition.

prop

Method used for the partition.<ul>
<li>"Random" splits the train and test sets randomly.</li>
<li>"OOT" splits by time, for an observation over time test.</li>
<li>"byRow" splits by row number.</li>
</ul>

split_type
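As an illustration of the "Random" and "byRow" methods, here is a minimal base R sketch. It is not the package's implementation: the toy data frame, its column names, and the rounding rule used to turn <code>prop</code> into a row count are all assumptions.

```r
# Hypothetical toy data; the column names are illustrative only.
df <- data.frame(
  id = 1:10,
  x = seq(0.1, 1, by = 0.1),
  target = rep(c(0, 1), 5)
)
prop <- 0.7
n_train <- round(nrow(df) * prop)  # assumed rounding rule

# "Random"-style split: sample row indices without replacement.
set.seed(46)
idx <- sample(seq_len(nrow(df)), n_train)
random_train <- df[idx, ]
random_test <- df[-idx, ]

# "byRow"-style split: the first n_train rows form the train set,
# the remaining rows form the test set.
byrow_train <- df[seq_len(n_train), ]
byrow_test <- df[-seq_len(n_train), ]
```

Both variants produce disjoint train and test sets; only the random variant depends on the seed.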

The name of the variable that represents the time at which each observation takes place. It is used for "OOT" split.

occur_time

Time points for splitting data sets, e.g. splitting Actual and Expected data sets.

cut_date

The earliest occurrence time of observations.

start_date
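The interplay of <code>occur_time</code>, <code>cut_date</code> and <code>start_date</code> in an "OOT" split can be sketched in base R as follows. This is only an illustration under stated assumptions: the column name <code>apply_date</code> is hypothetical, and whether the package treats the boundary dates as inclusive or exclusive is not stated here, so the comparison operators below are a guess.

```r
# Hypothetical toy data; 'apply_date' plays the role of occur_time.
df <- data.frame(
  id = 1:6,
  apply_date = as.Date(c("2023-01-15", "2023-02-10", "2023-03-05",
                         "2023-04-20", "2023-05-11", "2023-06-30")),
  target = c(0, 1, 0, 0, 1, 0)
)
start_date <- as.Date("2023-02-01")  # earliest occurrence time to keep
cut_date <- as.Date("2023-05-01")    # time point separating train and test

# Drop observations that occur before start_date (assumed inclusive bound).
in_window <- df$apply_date >= start_date

# Observations before cut_date go to train, the rest to test (assumed rule).
oot_train <- df[in_window & df$apply_date < cut_date, ]
oot_test <- df[in_window & df$apply_date >= cut_date, ]
```

With these dates, rows 2 to 4 land in the train set and rows 5 and 6 in the test set, so the test set is strictly later in time than the train set.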

Logical. Whether to save the results in a locally specified folder. Default is FALSE.

save_data

The path for periodically saved data file. Default is "./data".

dir_path

The name for periodically saved data file. Default is "dat".

file_name
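To show how <code>dir_path</code> and <code>file_name</code> could combine into output files, here is a hedged base R sketch. The file naming pattern (<code>_train.csv</code>, <code>_test.csv</code>) and the CSV format are assumptions made for illustration, not the package's documented behavior, and a temporary directory stands in for "./data".

```r
# Hypothetical split results to be saved.
train <- data.frame(id = 1:3, target = c(0, 1, 0))
test <- data.frame(id = 4:5, target = c(1, 0))

# A temporary folder is used here instead of "./data".
dir_path <- file.path(tempdir(), "data")
file_name <- "dat"

# Create the output folder if it does not exist yet.
if (!dir.exists(dir_path)) dir.create(dir_path, recursive = TRUE)

# Assumed naming pattern: <file_name>_train.csv and <file_name>_test.csv.
train_file <- file.path(dir_path, paste0(file_name, "_train.csv"))
test_file <- file.path(dir_path, paste0(file_name, "_test.csv"))

write.csv(train, train_file, row.names = FALSE)
write.csv(test, test_file, row.names = FALSE)
```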

Logical. Whether to print informational output. Default is TRUE.

note

Random number seed. Default is 46.

seed
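The role of the seed can be shown with base R alone: fixing it before sampling makes a random partition reproducible. This sketch assumes the seed is used in the usual <code>set.seed()</code> fashion; the sample sizes are arbitrary.

```r
n <- 100       # number of rows in a hypothetical data set
n_train <- 70  # rows to draw for the train set

# The same seed yields the same train indices on every run.
set.seed(46)
first_draw <- sample(seq_len(n), n_train)

set.seed(46)
second_draw <- sample(seq_len(n), n_train)

same_split <- identical(first_draw, second_draw)
```

Here <code>same_split</code> is <code>TRUE</code>, so two runs with the same seed select the same training rows.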

<code>train_test_split</code> partitions a data set into train and test sets.


Provides a toolkit for building predictive models in one integrated offering. Contains infrastructure functionalities such as data exploration and preparation, missing values treatment, outliers treatment, variable derivation, variable selection, dimensionality reduction, grid search for hyperparameters, data mining and visualization, model evaluation, strategy analysis etc. 'creditmodel' is designed to make the development of binary classification models (machine learning based models as well as credit scorecard) simpler and faster.

train_test_split: Train-Test-Split

Description

Usage

Arguments

Value

Examples
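A minimal usage sketch, assuming the data.frame is passed as the first positional argument and that the 'creditmodel' package is installed. The toy data and its column names are hypothetical. The structure of the return value is not described above, so the sketch only inspects it with <code>str()</code> rather than relying on particular element names.

```r
# Toy data; the column names are made up for illustration.
df <- data.frame(
  id = 1:100,
  x = seq(0.01, 1, length.out = 100),
  target = rep(c(0, 1), 50)
)

# Guarded so the snippet does nothing when 'creditmodel' is not installed.
if (requireNamespace("creditmodel", quietly = TRUE)) {
  res <- creditmodel::train_test_split(
    df,                      # assumed to be the first positional argument
    prop = 0.7,
    split_type = "Random",
    save_data = FALSE,
    note = FALSE,
    seed = 46
  )
  # Inspect the top-level structure of the result.
  str(res, max.level = 1)
}
```

For an "OOT" split, <code>split_type = "OOT"</code> would be combined with <code>occur_time</code> set to the name of the time variable and, optionally, <code>cut_date</code> and <code>start_date</code>.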