kbal (version 0.1.2)

drop_multicollin: Drop Multicollinear Columns

Description

Drops multicollinear columns from a data matrix, in order of highest correlation, until the remaining matrix is of full rank. Correlations between columns are computed with the cor function from the stats package.
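The general idea can be illustrated with a minimal base-R sketch. This is not the kbal source: the helper name drop_collinear_sketch, the rank check via qr(), and the rule for choosing which column of a correlated pair to drop are all assumptions made for illustration, so the real function may drop different columns.

```r
# Hypothetical helper (not part of kbal): repeatedly drop one column from the
# most strongly correlated pair until the matrix has full column rank.
drop_collinear_sketch <- function(x) {
  x <- as.matrix(x)
  dropped <- character(0)
  while (ncol(x) > 1 && qr(x)$rank < ncol(x)) {
    cc <- abs(stats::cor(x))
    diag(cc) <- 0          # ignore each column's correlation with itself
    cc[is.na(cc)] <- 0     # constant columns produce NA correlations
    # Row/column position of the largest absolute correlation
    idx <- which(cc == max(cc), arr.ind = TRUE)[1, ]
    drop_j <- max(idx)     # assumption: drop the later column of the pair
    dropped <- c(dropped, colnames(x)[drop_j])
    x <- x[, -drop_j, drop = FALSE]
  }
  list(allx_noMC = x, dropped_cols = dropped)
}

# Illustrative data: mc_1 duplicates x, mc_2 is a linear combination of x, y, z
set.seed(1)
base_cols <- data.frame(x = rnorm(100), y = sample.int(100, 100), z = runif(100, 3, 6))
dat_sketch <- cbind(mc_1 = base_cols$x,
                    mc_2 = base_cols$x * 2 + base_cols$y - base_cols$z,
                    base_cols)
res <- drop_collinear_sketch(dat_sketch)
res$dropped_cols
```

The sketch mirrors the shape of the documented return value (a list with allx_noMC and dropped_cols) so that it can be compared side by side with the output of drop_multicollin.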

Usage

drop_multicollin(allx, printprogress = TRUE)

Value

A list containing:

allx_noMC

resulting data matrix of full rank after multicollinear columns have been dropped.

dropped_cols

column names of the dropped columns.

Arguments

allx

a matrix of data to check for multicollinearity. All columns must be numeric.

printprogress

logical indicating whether progress should be printed to the console. Default is TRUE.
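Because allx must contain only numeric columns, it can help to check the input before calling the function. The following is a small base-R sketch; the data frame and its column names are hypothetical and used only for illustration.

```r
# Hypothetical input with one non-numeric column
df <- data.frame(age = c(23, 35, 41),
                 income = c(50, 62, 58),
                 group = c("a", "b", "a"))

# Flag which columns are numeric
is_num <- vapply(df, is.numeric, logical(1))

# Names of columns that would need to be recoded or removed first
names(df)[!is_num]

# Keep only the numeric columns and convert to a matrix
allx <- as.matrix(df[, is_num, drop = FALSE])
```

Here the non-numeric column is simply left out; whether to drop or recode such a column depends on the analysis.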

Examples

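library(kbal)

# Create data with multicollinearity
set.seed(123)  # fix the seed so the simulated data are reproducible
data <- data.frame(x = rnorm(100),
                   y = sample.int(100, 100),
                   z = runif(100, 3, 6))
# mc_1 duplicates x; mc_2 is a linear combination of x, y, and z
test <- data.frame(mc_1 = data$x,
                   mc_2 = data$x * 2 + data$y - data$z)
dat <- cbind(test, data)
# Run function
mc_check <- drop_multicollin(dat)
# Names of the columns that were dropped
mc_check$dropped_cols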