scorecard (version 0.1.0)

var_filter: Variable Filter

Description

This function filters variables based on the specified conditions: minimum information value (IV), maximum NA percentage, and maximum element percentage.
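For readers unfamiliar with IV, the following is a minimal base R sketch of the standard information value formula for a binary response. It assumes y is coded 0/1 and treats each distinct value of x as its own bin; `iv_sketch` is a hypothetical helper written for illustration, not a scorecard function, and the package's own calculation may bin or handle edge cases differently.

```r
# Hypothetical helper, not part of scorecard.
# IV of a categorical x against a binary y coded 0/1,
# using one bin per distinct value of x.
iv_sketch <- function(x, y) {
  tab  <- table(x, y)                      # counts per bin and class (NAs dropped)
  dist0 <- tab[, "0"] / sum(tab[, "0"])    # share of class 0 falling in each bin
  dist1 <- tab[, "1"] / sum(tab[, "1"])    # share of class 1 falling in each bin
  # Sum over bins; the result is not finite if any bin has a zero count
  sum((dist1 - dist0) * log(dist1 / dist0))
}

# Toy data: value "a" is mostly class 0, value "b" is mostly class 1
x <- c("a", "a", "a", "a", "b", "b", "b", "b")
y <- c(0, 0, 0, 1, 0, 1, 1, 1)
iv_informative <- iv_sketch(x, y)

# A variable whose bins have identical class distributions carries no information
iv_flat <- iv_sketch(c("a", "a", "b", "b"), c(0, 1, 0, 1))
```

A variable like the first one would pass any reasonable `iv_limit`, while the second has an IV of zero and would be dropped under the default of 0.02.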

Usage

var_filter(dt, y, x = NA, iv_limit = 0.02, na_perc_limit = 0.95,
  ele_perc_limit = 0.95, var_rm = NA, var_kp = NA)

Arguments

dt

A data frame with both x (predictor/feature) and y (response/label) variables.

y

Name of y variable.

x

Name vector of x variables. Default NA. If x is NA, all variables except y are treated as x variables.

iv_limit

The minimum IV of each kept variable, default 0.02.

na_perc_limit

The maximum NA percent of each kept variable, default 0.95.

ele_perc_limit

The maximum percentage that any single element (value) may take up among the non-NA entries of each kept variable, default 0.95.

var_rm

Name vector of variables that are always removed, default NA.

var_kp

Name vector of variables that are always kept, default NA.
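The two percentage limits can be illustrated with a short base R sketch. Everything here is an assumption made for illustration: the toy data, the helper names `na_perc` and `ele_perc`, the reading of "element percentage" as the share of the most frequent non-NA value, and the choice to treat the limits as inclusive. The package's internal implementation may differ.

```r
# Toy data; column names are invented for illustration.
dt <- data.frame(
  y         = c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1),
  mostly_na = c(1, rep(NA, 9)),                  # 90% of rows are NA
  constant  = rep("k", 10),                      # a single value in every row
  varied    = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5)    # each value appears twice
)

# Hypothetical helper: share of missing values in a column
na_perc <- function(v) mean(is.na(v))

# Hypothetical helper: share of the most frequent value among non-NA entries
ele_perc <- function(v) {
  v <- v[!is.na(v)]
  if (length(v) == 0) return(NA_real_)
  max(table(v)) / length(v)
}

x_vars <- setdiff(names(dt), "y")     # all variables except y
na_p   <- sapply(dt[x_vars], na_perc)
ele_p  <- sapply(dt[x_vars], ele_perc)

# Keep variables within both limits; 0.8 is used instead of the
# default 0.95 so that the small toy data shows an effect
keep   <- x_vars[na_p <= 0.8 & ele_p <= 0.8]
dt_sel <- dt[c("y", keep)]
```

Here `mostly_na` fails the NA limit, `constant` fails the element limit, and only `varied` survives alongside y.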

Value

A data frame containing y and the selected x variables.

Examples

# NOT RUN {
# Load German credit data
data(germancredit)

# variable filter
dt_selected <- var_filter(germancredit, y = "creditability")

# }