unbalanced (version 2.0)

ubSMOTE: SMOTE

Description

Function that implements SMOTE (Synthetic Minority Over-sampling Technique).
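The core idea of SMOTE, as described in the Chawla et al. reference, is to create a synthetic minority point by interpolating between a minority instance and one of its k nearest minority neighbours. The following is a minimal base-R sketch of that single step for numeric features only. The helper `smote_one` and the toy matrix are hypothetical and are not part of the unbalanced package, whose internal implementation may differ.

```r
# Hypothetical helper for illustration; not part of the unbalanced package.
# Generates one synthetic point from row i of a numeric minority-class matrix.
smote_one <- function(minority, i, k = 5) {
  x <- minority[i, ]
  # Euclidean distance from x to every minority row
  d <- sqrt(rowSums(sweep(minority, 2, x)^2))
  d[i] <- Inf                                  # exclude the point itself
  nn <- order(d)[seq_len(min(k, nrow(minority) - 1))]
  j <- nn[sample.int(length(nn), 1)]           # pick one neighbour at random
  gap <- runif(1)                              # random position on the segment
  x + gap * (minority[j, ] - x)                # interpolate between x and neighbour
}

set.seed(1)
minority <- matrix(rnorm(20), ncol = 2)        # toy minority class: 10 points, 2 features
synthetic <- smote_one(minority, i = 1, k = 3)
```

Because the new point is a convex combination of two existing minority points, it always lies on the line segment between them.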

Usage

ubSMOTE(X, Y, perc.over = 200, k = 5, perc.under = 200, verbose = TRUE)

Arguments

X
the input variables of the unbalanced dataset.
Y
the response variable of the unbalanced dataset. It must be a binary factor where the majority class is coded as 0 and the minority as 1.
perc.over
perc.over/100 is the number of new instances generated for each rare instance. If perc.over < 100 a single instance is generated.
k
the number of nearest neighbours used as the pool from which the new examples are generated.
perc.under
perc.under/100 is the number of "normal" (majority class) instances that are randomly selected for each synthetic ("smoted") observation.
verbose
print extra information (TRUE/FALSE)
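The interaction between perc.over and perc.under is easier to see with a small calculation. The sketch below uses a hypothetical helper, `expected_sizes`, which is not part of the package. It assumes that perc.over is a multiple of 100 and that the original minority instances are kept alongside the synthetic ones; the actual output sizes may differ, for example if rows are dropped.

```r
# Hypothetical helper for illustration; not part of the unbalanced package.
# Assumes perc.over is a multiple of 100 and that the original minority
# instances are retained in the output.
expected_sizes <- function(n_min, perc.over = 200, perc.under = 200) {
  stopifnot(perc.over >= 100, perc.over %% 100 == 0)
  n_new <- (perc.over / 100) * n_min    # synthetic minority instances
  n_maj <- (perc.under / 100) * n_new   # majority instances sampled
  c(minority = n_min + n_new, majority = n_maj)
}

sizes <- expected_sizes(n_min = 100)    # defaults: perc.over = 200, perc.under = 200
```

Under these assumptions, 100 minority instances with the default settings yield 200 synthetic instances (300 minority in total) and 400 sampled majority instances.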

Value

The function returns a list:
X
input variables
Y
response variable

Details

Y must be a factor.
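If the class labels are stored as text, they need to be recoded before calling the function. The following sketch shows one possible way to build a 0/1 factor in base R, treating the less frequent label as the minority class. The `labels` vector is made-up example data.

```r
# Made-up labels for illustration
labels <- c("good", "bad", "good", "good", "bad", "good")

# Treat the less frequent label as the minority class
counts <- table(labels)
minority <- names(counts)[which.min(counts)]

# Code the minority class as 1 and the majority class as 0
Y <- factor(as.integer(labels == minority), levels = c(0, 1))
```

Setting `levels = c(0, 1)` explicitly keeps both levels present even if a subset of the data happens to contain only one class.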

References

Chawla, Nitesh V., et al. "SMOTE: synthetic minority over-sampling technique." arXiv preprint arXiv:1106.1813 (2011).

See Also

ubBalance

Examples

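library(unbalanced)
data(ubIonosphere)

# The response (Class) is the last column; the other columns are the inputs.
n <- ncol(ubIonosphere)
output <- ubIonosphere$Class
input <- ubIonosphere[, -n]

# Apply SMOTE with the default settings.
data <- ubSMOTE(X = input, Y = output)
# Recombine the input variables and the response into one data frame.
newData <- cbind(data$X, data$Y)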