smote: Synthetic Minority Oversampling Technique to handle class imbalance in binary classification.
Description
In each iteration, the method samples one minority class element x1, then one of x1's nearest neighbors, x2.
The two points are then interpolated (convex-combined), resulting in a new virtual data point x3
for the minority class. The method handles factor features, too. The Gower distance is used for the nearest neighbor
calculation, see daisy.
For interpolation, the new factor level for x3
is sampled from the two given levels of x1 and x2 per feature.
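The procedure described above can be sketched in base R. This is a minimal illustration, not the package source: the data frame `minority` and the helpers `gower_to_all` and `smote_one` are hypothetical names, the distance is a hand-rolled Gower-style dissimilarity rather than a call to daisy, and using one interpolation weight for all numeric features and equal probabilities for the two factor levels are assumptions.

```r
set.seed(1)

# Hypothetical minority-class observations: two numeric features, one factor.
minority = data.frame(
  x = c(1.0, 1.2, 0.9, 3.0),
  y = c(2.0, 2.1, 1.8, 0.5),
  color = factor(c("red", "red", "blue", "blue"))
)

# Gower-style dissimilarity between row i and every row: range-scaled absolute
# differences for numerics, simple mismatch for factors, averaged over features.
gower_to_all = function(data, i) {
  per.feature = sapply(data, function(col) {
    if (is.numeric(col)) {
      rng = diff(range(col))
      if (rng == 0) rep(0, length(col)) else abs(col - col[i]) / rng
    } else {
      as.numeric(col != col[i])
    }
  })
  rowMeans(per.feature)
}

# Create one synthetic point x3 from a random x1 and one of its nn nearest neighbors.
smote_one = function(data, nn = 2L) {
  i = sample.int(nrow(data), 1L)             # index of x1
  d = gower_to_all(data, i)
  d[i] = Inf                                 # x1 is not its own neighbor
  neighbors = order(d)[seq_len(nn)]
  j = neighbors[sample.int(length(neighbors), 1L)]  # index of x2
  lambda = runif(1)                          # weight of the convex combination
  x3 = data[i, , drop = FALSE]
  for (f in names(data)) {
    if (is.numeric(data[[f]])) {
      # Numeric feature: point on the segment between x1 and x2.
      x3[[f]] = data[[f]][i] + lambda * (data[[f]][j] - data[[f]][i])
    } else {
      # Factor feature: take the level of x1 or x2 (assumed equally likely).
      pick = if (runif(1) < 0.5) i else j
      x3[[f]] = data[[f]][pick]
    }
  }
  rownames(x3) = NULL
  x3
}

x3 = smote_one(minority)
print(x3)
```

Because x3 is a convex combination, its numeric values always lie between those of x1 and x2, and its factor levels are always ones that already occur in the minority class.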
Usage
smote(task, rate, nn = 5L, standardize = TRUE, alt.logic = FALSE)
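A call might look like the following sketch. It assumes that this is the smote function from the mlr package, that task is a binary classification task created with makeClassifTask, and that rate is the factor by which the minority class is upsampled; the data frame `df` is made up for illustration.

```r
library(mlr)

set.seed(1)

# Hypothetical imbalanced data: 80 "neg" and 20 "pos" observations.
df = data.frame(
  x1 = rnorm(100),
  x2 = rnorm(100),
  y = factor(rep(c("neg", "pos"), times = c(80, 20)))
)
task = makeClassifTask(data = df, target = "y")

# Assumption: rate = 2 requests roughly twice as many minority observations.
task.smote = smote(task, rate = 2, nn = 5L)

# Compare class counts before and after oversampling.
print(table(getTaskTargets(task)))
print(table(getTaskTargets(task.smote)))
```

The majority class should be left unchanged; only synthetic minority observations are added.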
References
Chawla, N., Bowyer, K., Hall, L., & Kegelmeyer, P. (2000)
SMOTE: Synthetic Minority Over-sampling TEchnique.
In International Conference of Knowledge Based Computer Systems, pp. 46-57.
National Center for Software Technology, Mumbai, India, Allied Press.