Generate synthetic positive instances using the Borderline-SMOTE algorithm. The number of majority neighbors of each minority instance is used to divide the minority instances into three groups: SAFE, DANGER and NOISE. Only the DANGER instances are used to generate synthetic instances.
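The grouping step can be illustrated with a minimal base R sketch. It assumes the thresholds given in the cited paper (all neighbors in the majority class means NOISE, at least half means DANGER, otherwise SAFE). `classify_minority` is a hypothetical helper written for this illustration, not a function of the package, and the package's own boundary handling may differ.

```r
# Hypothetical helper (not part of the package): label each minority row
# as SAFE, DANGER or NOISE from its C nearest neighbors in the full data.
classify_minority <- function(X, target, minority, C = 5) {
  X <- as.matrix(X)
  d <- as.matrix(dist(X))   # pairwise Euclidean distances
  diag(d) <- Inf            # a point is not its own neighbor
  idx <- which(target == minority)
  labels <- vapply(idx, function(i) {
    nn <- order(d[i, ])[seq_len(C)]         # indices of the C nearest neighbors
    n_major <- sum(target[nn] != minority)  # how many of them are majority
    if (n_major == C) {
      "NOISE"
    } else if (n_major >= C / 2) {
      "DANGER"
    } else {
      "SAFE"
    }
  }, character(1))
  names(labels) <- idx
  labels
}

# Toy data on a line: a tight minority cluster, two minority points on a
# class border, and one minority point buried inside the majority cluster.
x <- c(0, 1, 2, 3, 50, 51, 100.5, 49, 52, 100:107)
target <- c(rep("p", 7), rep("n", 10))
X <- cbind(x = x, y = 0)
labels <- classify_minority(X, target, minority = "p", C = 3)
print(labels)
```

In this toy set the four clustered points come out as SAFE, the two border points as DANGER, and the buried point as NOISE, so only the two border points would seed synthetic instances.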
BLSMOTE(X,target,K=5,C=5,dupSize=0,method =c("type1","type2"))
The resulting dataset, consisting of the original minority instances, the synthetic minority instances and the original majority instances, with a vector of their respective target classes appended as the last column
The set of synthetic minority instances, with a vector of the minority target class appended as the last column
The set of original instances whose class is not oversampled, with a vector of their target class appended as the last column
The set of original instances whose class is oversampled, with a vector of their target class appended as the last column
The value of parameter K for nearest neighbor process used for generating data
The value of parameter C for nearest neighbor process used for determining SAFE/DANGER/NOISE
The maximum number of times synthetic minority instances were generated relative to the original majority instances during oversampling
Unavailable for this method
Unavailable for this method
The name of oversampling method and type used for this generated dataset (BLSMOTE type1/2)
A data frame or matrix containing a dataset with numeric attributes
A vector of the target class attribute corresponding to the dataset X
The number of nearest neighbors used during the sampling process
The number of nearest neighbors used when determining the SAFE/DANGER/NOISE groups
A number or vector giving the desired number of times synthetic minority instances are generated relative to the original number of majority instances; 0 duplicates until the classes are balanced
Indicates which type of Borderline-SMOTE presented in the paper is used, "type1" or "type2"
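As a rough illustration of how the two types differ, the sketch below follows the description in the cited paper: a synthetic point is placed on the segment between a DANGER instance and one of its neighbors, and in type 2 a step towards a majority neighbor is limited to a random gap between 0 and 0.5 rather than between 0 and 1. `synth_point` and its arguments are hypothetical names used only here, and the package's internal implementation may differ.

```r
# Hypothetical helper (not part of the package): build one synthetic point
# between a DANGER instance p and one of its neighbors nb.
synth_point <- function(p, nb, nb_is_majority = FALSE) {
  # Type 2 idea: a step towards a majority neighbor is capped at 0.5,
  # so the new point stays closer to the minority instance.
  max_gap <- if (nb_is_majority) 0.5 else 1
  gap <- runif(1, min = 0, max = max_gap)
  p + gap * (nb - p)   # linear interpolation from p towards nb
}

set.seed(1)
p  <- c(1, 1)   # a DANGER minority instance
nb <- c(3, 5)   # one of its nearest neighbors

s1 <- synth_point(p, nb)                         # minority neighbor, gap in [0, 1]
s2 <- synth_point(p, nb, nb_is_majority = TRUE)  # majority neighbor, gap in [0, 0.5]
```

Both points lie on the segment from `p` to `nb`; the second never passes the midpoint, which keeps synthetic instances on the minority side of the border.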
Wacharasak Siriseriwan <wacharasak.s@gmail.com>
Han, H., Wang, W.Y. and Mao, B.H. Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning. In Proceedings of the 2005 international conference on Advances in Intelligent Computing - Volume Part I (ICIC'05), De-Shuang Huang, Xiao-Ping Zhang, and Guang-Bin Huang (Eds.), Vol. Part I. Springer-Verlag, Berlin, Heidelberg, 2005. 878-887. DOI=http://dx.doi.org/10.1007/11538059_91
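library(smotefamily)
data_example = sample_generator(5000,ratio = 0.80)
# Default settings; column 3 of data_example is passed as the class label
genData = BLSMOTE(data_example[,-3],data_example[,3])
# Type 2 variant with K = 7 neighbors for sampling and C = 5 for grouping
genData_2 = BLSMOTE(data_example[,-3],data_example[,3],K=7, C=5, method = "type2")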