This function reduces the number of distinct categorical values in a vector by replacing the less frequent ones with a single label. It is tidyverse-friendly, so it can be used in pipelines.
categ_reducer(df, ..., nmin = 0, pmin = 0, pcummax = 100, top = NA,
other_label = "other")
df: Categorical vector.
...: Variables. Which variable do you wish to reduce?
nmin: Integer. Minimum number of times a value is repeated.
pmin: Numerical. Minimum percentage of times a value is repeated.
pcummax: Numerical. Top cumulative percentage of the most repeated values.
top: Integer. Keep the n most frequently repeated values.
other_label: Character. Text used to replace the filtered values.
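To make the thresholds concrete, here is a minimal base R sketch of the idea behind the nmin, pmin, top and other_label arguments. The helper `reduce_levels_sketch` is hypothetical and is not the package's implementation. It assumes pmin is expressed on a 0-100 scale, and it leaves out pcummax.

```r
# Hypothetical helper, not part of the package: collapses infrequent values
# of a vector into a single label using nmin, pmin and top thresholds.
reduce_levels_sketch <- function(x, nmin = 0, pmin = 0, top = NA,
                                 other_label = "other") {
  x <- as.character(x)
  counts <- sort(table(x), decreasing = TRUE)  # frequency of each value
  pct <- 100 * counts / sum(counts)            # share of each value (assumed 0-100)
  # Keep values that meet both the count and the percentage thresholds
  keep <- names(counts)[counts >= nmin & pct >= pmin]
  if (!is.na(top)) {
    # Additionally restrict to the `top` most frequent values
    most_frequent <- names(counts)[seq_len(min(top, length(counts)))]
    keep <- intersect(most_frequent, keep)
  }
  # Replace everything else (except missing values) with the label
  x[!(x %in% keep) & !is.na(x)] <- other_label
  x
}

colors <- c("red", "red", "red", "blue", "blue", "green", "pink")
reduced_nmin <- reduce_levels_sketch(colors, nmin = 2)
reduced_top <- reduce_levels_sketch(colors, top = 1, other_label = "rest")
print(table(reduced_nmin))
print(table(reduced_top))
```

In this sketch the thresholds combine with a logical AND, so a value is kept only if it passes every active filter. Whether the package combines them the same way is an assumption.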
Other Data Wrangling: balance_data, calibrate, cleanText, date_feats, dateformat, formatNum, formatTime, holidays, impute, left, normalize, numericalonly, ohse, one_hot_encoding_commas, rbind_full, removenacols, removenarows, replaceall, right, textFeats, textTokenizer, vector2text, year_month, year_week