dataframeexplorer (version 1.0.2)

detect_dupl_cols: Detect if any column of a data.frame is a duplicate of another

Description

Occasionally, two or more columns in a data frame are exactly identical. This adds redundant computational cost and can cause unexpected behavior in machine learning methods. This function scans through all pairwise column combinations of the data frame to check whether any two columns are exactly identical.
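The pairwise scan described above can be sketched in base R. This is only an illustration of the idea, not the package's actual implementation: `find_dupl_cols` is a hypothetical helper, and it assumes "exactly identical" means `identical()` returns TRUE for the two columns.

```r
# Hypothetical helper illustrating a pairwise duplicate-column scan.
# Not the source of detect_dupl_cols(); assumes identical() defines "duplicate".
find_dupl_cols <- function(df, duplicate_col = c("right", "left")) {
  duplicate_col <- match.arg(duplicate_col)
  dupes <- character(0)
  n <- ncol(df)
  if (n < 2) return(dupes)

  pairs <- combn(n, 2)  # each matrix column holds one pair (i, j) with i < j
  for (k in seq_len(ncol(pairs))) {
    i <- pairs[1, k]
    j <- pairs[2, k]
    if (identical(df[[i]], df[[j]])) {
      # Report the right-hand (j) or left-hand (i) member of the pair
      flagged <- if (duplicate_col == "right") j else i
      dupes <- c(dupes, names(df)[flagged])
    }
  }
  unique(dupes)
}

df <- head(mtcars)
df$mpg_2 <- df$mpg            # deliberately duplicate a column

find_dupl_cols(df, "right")   # "mpg_2"
find_dupl_cols(df, "left")    # "mpg"
```

Comparing every pair costs on the order of the square of the number of columns, which is usually acceptable for data frames of moderate width.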

Usage

detect_dupl_cols(dataset, return_type = "col_names", duplicate_col = "right")

Arguments

dataset

A data.frame

return_type

How to return the detected duplicate columns. Use "col_names" to return column names, "col_positions" to return column positions, or "dataset" to return the dataset with the duplicate columns removed.

duplicate_col

If two columns are identical, which of the two should be treated as the duplicate? Use "right" for the right column, "left" for the left column.

Value

A vector of duplicate column names, a vector of duplicate column positions, or the dataset with the duplicate columns removed, depending on the return_type argument.
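To make the three kinds of return value concrete, here is a rough base R sketch that produces the analogous results without the package. It relies on `duplicated()` applied to the list of columns, which flags the later (right-hand) copy of each identical column; the exact shape of what detect_dupl_cols returns may differ.

```r
# Base R analogue of the three return types (an assumption-based sketch,
# not output captured from detect_dupl_cols()).
df <- head(mtcars)
df$mpg_2 <- df$mpg                 # column 12 duplicates column 1

is_dup <- duplicated(as.list(df))  # TRUE for the later copy of an identical column

dup_names     <- names(df)[is_dup] # analogous to return_type = "col_names"
dup_positions <- which(is_dup)     # analogous to return_type = "col_positions"
deduplicated  <- df[!is_dup]       # analogous to return_type = "dataset"

dup_names       # "mpg_2"
dup_positions   # 12
names(deduplicated)
```

Passing `fromLast = TRUE` to `duplicated()` flags the earlier copy instead, which corresponds to treating the left column as the duplicate.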

Examples

# NOT RUN {
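library(dplyr)              # provides mutate()
library(dataframeexplorer)  # provides detect_dupl_cols()

# Add mpg_2 as an exact copy of mpg, then ask which column is the duplicate
detect_dupl_cols(dataset = head(mutate(mtcars, mpg_2 = mpg)), duplicate_col = "right")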
# }