
scorecardModelUtils (version 0.0.1.0)

vif_filter: Removing multicollinearity from a model using vif test

Description

The function takes a dataset containing only the starting variables and the target. The VIF (variance inflation factor) is calculated for each variable, and if the maximum VIF value exceeds the threshold, the variable with that maximum VIF is dropped from the model and the VIFs are recomputed. This cycle of computing VIFs and dropping a variable repeats until the maximum VIF value is less than or equal to the threshold.
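The drop-and-recompute loop described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper names `compute_vif` and `iterative_vif_drop` are hypothetical, the predictors are assumed to be numeric, and the VIF is computed as 1 / (1 - R^2) from a linear regression of each predictor on the others, which may differ from how `vif_filter` computes it internally.

```r
# Hypothetical helper (not part of scorecardModelUtils): VIF for each numeric
# predictor, computed as 1 / (1 - R^2) from regressing it on all the others.
compute_vif <- function(df) {
  vars <- names(df)
  sapply(vars, function(v) {
    others <- setdiff(vars, v)
    fit <- lm(reformulate(others, response = v), data = df)
    1 / (1 - summary(fit)$r.squared)
  })
}

# Hypothetical sketch of the iteration: drop the variable with the largest VIF
# and recompute, until the largest VIF is at or below the threshold.
iterative_vif_drop <- function(df, threshold = 2) {
  dropped <- character(0)
  while (ncol(df) > 1) {
    vifs <- compute_vif(df)
    if (max(vifs) <= threshold) break
    worst <- names(vifs)[which.max(vifs)]
    dropped <- c(dropped, worst)
    df <- df[, setdiff(names(df), worst), drop = FALSE]
  }
  list(retained = names(df), dropped = dropped)
}

# Usage on the four numeric columns of iris (no target column is needed here,
# because this VIF definition depends only on the predictors).
res <- iterative_vif_drop(iris[, 1:4], threshold = 2)
res$retained
res$dropped
```

In this sketch the loop stops early if only one variable remains, since a VIF cannot be computed for a single predictor.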

Usage

vif_filter(base, target, threshold = 2)

Arguments

base

input data frame containing only the final set of variables and the target

target

column / field name of the target variable, passed as a string (the target must be of 0/1 type)

threshold

threshold value for vif (default value is 2)

Value

An object of class "vif_filter" is a list containing the following components:

vif_table

vif table post vif filtering

model

the model used for vif calculation

retain_var_list

variables remaining in the model post vif filter as an array

dropped_var_list

variables dropped from the model in vif filter step

threshold

threshold value used for the vif filter

Examples

# NOT RUN {
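library(scorecardModelUtils)

# Build a toy dataset: iris plus a random 0/1 target column Y
data <- iris
suppressWarnings(RNGversion('3.5.0'))  # use the R 3.5.0 RNG behaviour for reproducibility
set.seed(11)
data$Y <- sample(0:1, size = nrow(data), replace = TRUE)

# Run the vif filter with the default threshold of 2
vif_data_list <- vif_filter(base = data, target = "Y")

# Inspect the components of the returned list
vif_data_list$vif_table
vif_data_list$model
vif_data_list$retain_var_list
vif_data_list$dropped_var_list
vif_data_list$threshold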
# }
