cleanse.split_df

Diagnoses the similarity between the train set and the test set contained in a "split_df" object, and cleanses the "split_df" object.

A collection of tools that support data splitting, predictive modeling, and model evaluation.
A typical use is to split a dataset into a training dataset and a test dataset,
and then compare the data distributions of the two datasets.
It also supports developing predictive models and comparing the performance of several models,
helping to select the best one.

Choonghyun Ryu

alookr

Model Classifier for Binary Classification

cleanse.split_df function

Arguments

<dl><dt>.data</dt>
<dd>an object of class "split_df", usually the result of a call to split_df().</dd>
<dt>add_character</dt>
<dd>logical. Whether to include character (text) variables when comparing categorical data. The default is FALSE, which excludes character variables.</dd>
<dt>uniq_thres</dt>
<dd>numeric. Threshold for removing variables: a variable is removed when its ratio of unique values (number of unique values / number of observations) is greater than this value.</dd>
<dt>missing</dt>
<dd>logical. Whether to remove variables that contain missing values.</dd>
<dt>...</dt>
<dd>further arguments passed to or from other methods.</dd></dl>

Cleansing the dataset for classification modeling — cleanse.split_df
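The `uniq_thres` and `missing` criteria described in the arguments can be illustrated with a small base R sketch. This is not alookr's implementation: `flag_columns()` and the `toy` data frame are hypothetical names introduced here, the threshold 0.9 is only an illustrative value, and the sketch assumes the ratio is applied to every column and that `NA` counts as a distinct value, which may differ from what the package does.

```r
# Hypothetical helper (not part of alookr): returns the names of columns that
# would be dropped under the two criteria described in the arguments.
flag_columns <- function(df, uniq_thres = 0.9, missing = FALSE) {
  n <- nrow(df)
  # Ratio of unique values = number of unique values / number of observations
  uniq_ratio <- vapply(df, function(x) length(unique(x)) / n, numeric(1))
  # TRUE for columns that contain at least one missing value
  has_na <- vapply(df, anyNA, logical(1))
  drop <- uniq_ratio > uniq_thres
  if (missing) drop <- drop | has_na
  names(df)[drop]
}

# Toy data: `id` is unique per row, `score` contains one missing value
toy <- data.frame(
  id = 1:10,
  group = rep(c("a", "b"), 5),
  score = c(1.5, NA, 2.0, 2.0, 3.1, 1.5, 2.0, 3.1, 1.5, 2.0),
  stringsAsFactors = FALSE
)

flag_columns(toy, uniq_thres = 0.9)                  # flags "id" only
flag_columns(toy, uniq_thres = 0.9, missing = TRUE)  # flags "id" and "score"
```

With alookr itself, the corresponding call on a "split_df" object `sb` would presumably look like `cleanse(sb, uniq_thres = 0.9, missing = TRUE)`, using only the arguments documented above.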

