smotefamily (version 1.3)

ANS: Adaptive Neighbor Synthetic Minority Oversampling TEchnique

Description

Generates an oversampled dataset from an imbalanced dataset using Adaptive Neighbor SMOTE, which automatically assigns the parameter K to each minority instance.

Usage

ANS(X, target, dupSize = 0)

Arguments

X

A data frame or matrix containing the numeric attributes of the dataset.

target

A vector of target class labels corresponding to the dataset X.

dupSize

A number or vector giving the desired multiple of synthetic minority instances over the original number of majority instances; 0 produces a balanced dataset.
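As a rough illustration of how the arguments fit together, the sketch below checks that X is entirely numeric and that target has one label per row before calling ANS. It assumes the data come from the package's sample_generator with the class label in column 3, as in the package example; the seed and sample size are arbitrary choices made for illustration.

```r
library(smotefamily)

set.seed(42)  # arbitrary seed, only for reproducibility of this sketch

# Illustrative data: numeric attributes plus a class label in column 3
df <- sample_generator(1000, ratio = 0.80)
X <- df[, -3]       # numeric attributes only
target <- df[, 3]   # class labels, one per row of X

# Basic sanity checks on the inputs before oversampling
stopifnot(all(sapply(X, is.numeric)))
stopifnot(length(target) == nrow(X))

# dupSize = 0 requests a balanced result
balanced <- ANS(X, target, dupSize = 0)
```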

Value

data

The resulting dataset, consisting of the original minority instances, the synthetic minority instances and the original majority instances, with a vector of their respective target classes appended as the last column.

syn_data

The set of synthetic minority instances, with a vector of the minority target class appended as the last column.

orig_N

The set of original instances whose class is not oversampled, with a vector of their target class appended as the last column.

orig_P

The set of original instances whose class is oversampled, with a vector of their target class appended as the last column.

K

A vector of parameter K for each minority instance

K_all

The value of the parameter C for the nearest-neighbor process used to identify outcasts.

dup_size

The maximum multiple of synthetic minority instances over original majority instances used in the oversampling.

outcast

The set of original minority instances identified as minority outcasts.

eps

The value of eps, which determines the automatically assigned K.

method

The name of oversampling method used for this generated dataset (ANS)
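The components described above can be inspected directly on the returned object. The following is a minimal sketch, assuming the return value is a list whose element names match the entries above and that the class label sits in the last column of each returned dataset, as described. The seed is an arbitrary choice, and the data come from the package's sample_generator as in the package example.

```r
library(smotefamily)

set.seed(1)  # arbitrary seed, only for reproducibility of this sketch

# Illustrative data: numeric attributes plus a class label in column 3
data_example <- sample_generator(5000, ratio = 0.80)
genData <- ANS(data_example[, -3], data_example[, 3])

# Combined dataset; the class label is the last column
combined <- genData$data
table(combined[, ncol(combined)])  # class counts after oversampling

nrow(genData$syn_data)  # number of synthetic minority instances
nrow(genData$orig_P)    # original instances of the oversampled class
nrow(genData$orig_N)    # original instances of the class not oversampled
summary(genData$K)      # distribution of the per-instance K values
genData$method          # name of the oversampling method
```

Indexing the class column by position rather than by name avoids relying on a particular column name in the returned data.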

References

Siriseriwan, W. and Sinapiromsaran, K. Adaptive neighbor Synthetic Minority Oversampling TEchnique under 1NN outcast handling. Songklanakarin Journal of Science and Technology.

Examples

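library(smotefamily)
# Generate a sample imbalanced dataset; column 3 holds the class label
data_example <- sample_generator(5000, ratio = 0.80)
# Oversample with ANS, passing the attributes as X and column 3 as target
genData <- ANS(data_example[, -3], data_example[, 3])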