This function balances a given data.frame by resampling it so that the two classes of a binary feature reach a given ratio (rate).
balance_data(df, variable, rate = 1, seed = 0)
df: Vector or Dataframe. Contains different variables in each column, separated by a specific character.
variable: Character. Name of the binary variable used to resample df.
rate: Numeric. How many X for every Y are needed. Default: 1. If variable has more than 2 unique values, rate instead represents a percentage of the number of rows.
seed: Numeric. Seed used to replicate the resampling and obtain the same values.
Other Data Wrangling: categ_reducer(), cleanText(), date_cuts(), date_feats(), dateformat(), formatNum(), formatTime(), holidays(), impute(), left(), normalize(), numericalonly(), ohe_commas(), ohse(), rbind_full(), removenacols(), removenarows(), replaceall(), right(), textFeats(), textTokenizer(), vector2text(), year_month(), year_week()
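library(lares)
data(dft) # Titanic dataset
# Resample so that both classes of "Survived" appear at a 1:1 rate
df <- balance_data(dft, "Survived", rate = 1, seed = 123)
# Check the class frequencies after balancing
freqs(df, Survived)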
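To make the rate argument more concrete, here is a minimal base R sketch of one plausible reading of "how many X for every Y": the larger class is randomly downsampled to round(rate * n) rows, where n is the size of the smaller class. The helper downsample_binary() and the toy data are hypothetical and are not part of the package; the actual behaviour of balance_data() may differ in its details.

```r
# Hypothetical helper (an assumption for illustration, not the package's code):
# downsample the larger class of a binary column to rate * (smaller class size).
downsample_binary <- function(df, variable, rate = 1, seed = 0) {
  set.seed(seed) # make the random draw reproducible
  labels <- as.character(df[[variable]])
  counts <- table(labels)
  stopifnot(length(counts) == 2) # only the binary case is sketched here
  minority <- names(counts)[which.min(counts)]
  majority <- setdiff(names(counts), minority)
  minority_rows <- which(labels == minority)
  majority_rows <- which(labels == majority)
  # Keep at most rate * (minority size) rows from the larger class
  n_keep <- min(length(majority_rows), round(rate * length(minority_rows)))
  kept <- majority_rows[sample.int(length(majority_rows), n_keep)]
  df[sort(c(minority_rows, kept)), , drop = FALSE]
}

# Toy data: 3 TRUE rows and 7 FALSE rows
toy <- data.frame(id = 1:10, flag = c(rep(TRUE, 3), rep(FALSE, 7)))
balanced <- downsample_binary(toy, "flag", rate = 1, seed = 123)
table(balanced$flag)
```

With rate = 1 the toy result keeps all 3 TRUE rows and 3 randomly chosen FALSE rows; calling it again with the same seed returns the same rows.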
Run the code above in your browser using DataLab