SRCS (version 1.0)

ML2: Performance of 6 different preprocessing algorithms over 6 supervised classification algorithms in imbalanced datasets

Description

Dataset with the test accuracy of 6 supervised classification algorithms on three imbalanced datasets, each previously treated with 6 preprocessing algorithms (also known as filtering techniques). The aim is to study how the preprocessing techniques perform when coupled with different classification algorithms, so the target is the preprocessing technique rather than the classification algorithm. There are two data-frame objects, one for each of two imbalance ratios in the datasets (IR = N-/N+, where N- and N+ are the number of examples in the majority and minority classes, respectively). Object ML2a corresponds to IR = 5, while ML2b corresponds to IR = 7.
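As a rough illustration of the imbalance ratio defined above, the sketch below computes IR = N-/N+ from a vector of two-class labels in base R. The helper `imbalance_ratio` and the toy labels are hypothetical; they are not part of SRCS and are not taken from the datasets behind ML2.

```r
# Hypothetical helper (not part of SRCS): imbalance ratio IR = N- / N+
# for a two-class label vector, where N- is the size of the majority
# class and N+ the size of the minority class.
imbalance_ratio <- function(labels) {
  counts <- table(labels)
  stopifnot(length(counts) == 2)  # defined here for binary problems only
  as.numeric(max(counts) / min(counts))
}

# Toy labels (made up for illustration): 50 majority, 10 minority examples
toy_labels <- c(rep("neg", 50), rep("pos", 10))
ir <- imbalance_ratio(toy_labels)
print(ir)  # 5, the imbalance ratio associated with ML2a
```

With 70 majority and 10 minority examples the same helper would return 7, the ratio associated with ML2b.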

Usage

data(ML2)

Source

N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer. 2002. SMOTE: synthetic minority oversampling technique. Journal of Artificial Intelligence Research 16 (2002), 321-357.

References

C. Bunkhumpornpat, K. Sinapiromsaran, and C. Lursinsap. 2009. Safe-Level-SMOTE: Safe-Level-Synthetic Minority Over-Sampling TEchnique for Handling the Class Imbalanced Problem. In Proceedings of the 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining (PAKDD 09). Springer-Verlag, Berlin, Heidelberg, 475-482.

G. Batista, R. Prati, and M. Monard. 2004. A study of the behavior of several methods for balancing machine learning training data. ACM SIGKDD Explorations Newsletter 6, 1 (2004), 20-29.

Examples

data(ML2)
str(ML2a)
head(ML2a)