ML2: Performance of 6 different preprocessing algorithms over 6 supervised classification algorithms in imbalanced datasets
Description
This dataset contains the test accuracy of 6 supervised classification algorithms on three imbalanced datasets that were
previously treated with 6 preprocessing algorithms, also known as filtering techniques. The aim is to study the
performance of the preprocessing techniques when coupled with different classification algorithms; hence the target
is the preprocessing technique rather than the classification algorithm. There are two data-frame objects, one for each of two
imbalance ratios (IR = N-/N+, where N- and N+ are the number of examples in the majority and minority classes, respectively) in the datasets.
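As an illustration of the imbalance ratio defined above, the following minimal Python sketch computes IR = N-/N+ from a vector of class labels. The function name `imbalance_ratio` and the example labels are hypothetical and are not part of this dataset or its package; the sketch assumes a two-class problem.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Return IR = N-/N+ for a two-class label sequence.

    N- is the size of the majority class and N+ the size of the
    minority class. (Hypothetical helper, for illustration only.)
    """
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")
    n_majority = max(counts.values())
    n_minority = min(counts.values())
    return n_majority / n_minority


# Made-up labels: 50 majority examples and 10 minority examples.
labels = ["neg"] * 50 + ["pos"] * 10
print(imbalance_ratio(labels))  # 5.0
```

With these made-up labels the ratio is 50/10 = 5, the same imbalance ratio as the one associated with object ML2a.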
Object ML2a corresponds to IR = 5 while ML2b corresponds to IR = 7.
Source
N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer. 2002. SMOTE: Synthetic Minority Over-sampling
Technique. Journal of Artificial Intelligence Research 16 (2002), 321-357.
References
C. Bunkhumpornpat, K. Sinapiromsaran, and C. Lursinsap. 2009. Safe-Level-SMOTE: Safe-Level-Synthetic
Minority Over-Sampling TEchnique for Handling the Class Imbalanced Problem. In Proceedings of
the 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining (PAKDD 09).
Springer-Verlag, Berlin, Heidelberg, 475-482.
G. Batista, R. Prati, and M. Monard. 2004. A study of the behavior of several methods for balancing machine
learning training data. ACM SIGKDD Explorations Newsletter 6, 1 (2004), 20-29.