smotefamily (version 1.3.1)

SLS: Safe-level SMOTE

Description

Generates synthetic positive instances using the Safe-level SMOTE algorithm, which uses the "safe-level" of each instance to determine the possible locations of synthetic instances.
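To make the notion of a safe-level concrete, the sketch below counts, for each minority instance, how many of its C nearest neighbors also belong to the minority class, which is the definition given in the paper cited under References. It is a base-R illustration only, not the package's internal code: the helper name `safe_level` and its `minority` argument are hypothetical, and excluding an instance from its own neighbor list is an assumption.

```r
# Hypothetical helper, not part of smotefamily: safe-level of each minority row,
# taken here as the number of minority instances among its C nearest neighbors.
safe_level <- function(X, target, minority, C = 5) {
  d <- as.matrix(dist(as.matrix(X)))  # pairwise Euclidean distances
  diag(d) <- Inf                      # assumption: a point is not its own neighbor
  idx <- which(target == minority)
  sapply(idx, function(i) {
    nn <- order(d[i, ])[seq_len(C)]   # indices of the C nearest neighbors
    sum(target[nn] == minority)       # how many of them are minority
  })
}

# Toy data: a small minority cluster plus one minority point surrounded by
# majority points.
X <- rbind(c(0, 0), c(0.1, 0), c(0, 0.1),               # minority cluster
           c(5, 5),                                     # isolated minority point
           c(5.1, 5), c(5, 5.1), c(4.9, 5), c(5, 4.9))  # majority points
target <- c("p", "p", "p", "p", "n", "n", "n", "n")

sl <- safe_level(X, target, minority = "p", C = 2)
print(sl)  # 2 2 2 0
```

The isolated point has a safe-level of zero, so under the definition used in this page it would be treated as a minority outcast. The full distance matrix makes this sketch suitable for small datasets only.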

Usage

SLS(X, target, K = 5, C = 5, dupSize = 0)

Value

data

The resulting dataset, consisting of the original minority instances, the synthetic minority instances and the original majority instances, with a vector of their respective target classes appended as the last column

syn_data

The set of synthetic minority instances, with a vector of the minority target class appended as the last column

orig_N

The set of original instances whose class is not oversampled, with a vector of their target class appended as the last column

orig_P

The set of original instances whose class is oversampled, with a vector of their target class appended as the last column

K

The value of parameter K used in the nearest-neighbor search when generating data

K_all

The value of parameter C used in the nearest-neighbor search when calculating the safe-level

dup_size

The maximum number of times that synthetic minority instances are generated over the original majority instances in the oversampling

outcast

The set of original minority instances whose safe-level equals zero; these are defined as the minority outcasts

eps

Unavailable for this method

method

The name of oversampling method used for this generated dataset (SLS)
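The components listed above can be inspected as elements of the returned list. The following is a minimal sketch, assuming the smotefamily package is installed and that `sample_generator` draws from R's random number generator (so `set.seed` makes the run repeatable); the variable name `res` is chosen for illustration.

```r
library(smotefamily)

set.seed(1)  # assumption: fixes the random draw used by sample_generator
data_example <- sample_generator(1000, ratio = 0.80)

# Column 3 holds the class label; the remaining columns are numeric attributes.
res <- SLS(data_example[, -3], data_example[, 3], K = 5, C = 5)

names(res)                          # names of the returned components
nrow(res$orig_P)                    # original instances of the oversampled class
nrow(res$orig_N)                    # original instances of the other class
nrow(res$syn_data)                  # synthetic minority instances
table(res$data[, ncol(res$data)])   # class counts in the combined dataset
res$K                               # K used for generating data
res$K_all                           # C used for calculating the safe-level
```

Since `data` is described as the combination of `orig_P`, `syn_data` and `orig_N`, its row count should equal the sum of the three row counts printed above.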

Arguments

X

A data frame or matrix containing a dataset with numeric attributes

target

A vector of the target class attribute corresponding to the dataset X

K

The number of nearest neighbors used during the sampling process

C

The number of nearest neighbors used when calculating the safe-level

dupSize

A number or vector giving the desired number of times synthetic minority instances are generated over the original number of majority instances

Author

Wacharasak Siriseriwan <wacharasak.s@gmail.com>

References

Bunkhumpornpat, C., Sinapiromsaran, K. and Lursinsap, C. 2009. Safe-level-SMOTE: Safe-level-synthetic minority oversampling technique for handling the class imbalanced problem. Proceedings of the 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining. 2009, 475-482.

Examples

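    library(smotefamily)
    # Generate an example dataset of 5000 instances; column 3 holds the class label
    data_example <- sample_generator(5000, ratio = 0.80)
    genData <- SLS(data_example[, -3], data_example[, 3])
    # Use 7 nearest neighbors for sampling and 5 for the safe-level calculation
    genData_2 <- SLS(data_example[, -3], data_example[, 3], K = 7, C = 5)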
