SMOTEBagging uses both SMOTE (Synthetic Minority Over-sampling TEchnique) and random over-sampling to increase the number of minority instances in each bag of Bagging, rebalancing the class distribution. Each manipulated training set contains equal numbers of majority and minority instances, but the proportion of minority instances generated by SMOTE versus random over-sampling varies from bag to bag, determined by an assigned re-sampling rate a. The re-sampling rate a is always a multiple of 10, and the function generates the vector of rates automatically, so users do not need to define it themselves.
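For illustration, a minimal R sketch of one plausible way the vector of rates could be generated is shown below; the cycling from 10 to 100 in steps of 10 is an assumption made here for readability, not a description of the package's internal code.

    # Illustrative only: one re-sampling rate per bag, cycling through
    # multiples of 10; the function builds such a vector automatically.
    n_bags <- 40
    rates  <- rep(seq(10, 100, by = 10), length.out = n_bags)
    head(rates, 12)
    #  [1]  10  20  30  40  50  60  70  80  90 100  10  20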
The function requires the target variable to be a factor with levels 0 and 1, where 1 indicates a minority instance and 0 a majority instance. Only binary classification is implemented in this version.
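As an example, assuming a data frame dat with a binary outcome column y (both names are placeholders used only for illustration), the target can be encoded in the required form as follows:

    # Recode the outcome so the minority class is 1 and the majority class is 0.
    minority <- names(which.min(table(dat$y)))   # level with the fewest instances
    dat$y    <- factor(ifelse(dat$y == minority, 1, 0), levels = c(0, 1))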
The argument alg specifies the learning algorithm used to train the weak learners within the ensemble model. Five algorithms are implemented: cart (Classification and Regression Tree), c50 (C5.0 Decision Tree), rf (Random Forest), nb (Naive Bayes), and svm (Support Vector Machine). When Random Forest is used as the base learner, the ensemble model consists of forests, each of which contains a number of trees.
The returned list is assigned the class modelBag, so it can be passed directly to predict() to predict test instances.
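A hedged end-to-end sketch follows; the training function name and argument names used here (sbag(), formula, data, size, alg), as well as the predict() arguments, are assumptions made for illustration and should be checked against the package's actual interface.

    # Illustrative only: the call signature below is assumed, not guaranteed.
    model <- sbag(y ~ ., data = train_dat, size = 40, alg = "c50")
    class(model)                                  # expected to be "modelBag"
    pred  <- predict(model, newdata = test_dat)   # predictions for test instances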