unbalanced (version 2.0)

ubBalance: Balance wrapper

Description

ubBalance is a wrapper around several techniques that either re-balance an unbalanced dataset or remove noisy instances from it.

Usage

ubBalance(X, Y, type="ubSMOTE", positive=1, percOver=200, percUnder=200, k=5, perc=50, method="percPos", w=NULL, verbose=FALSE)

Arguments

X
the input variables of the unbalanced dataset.
Y
the response variable of the unbalanced dataset.
type
the balancing technique to use (ubOver, ubUnder, ubSMOTE, ubOSS, ubCNN, ubENN, ubNCL, ubTomek).
positive
the minority (positive) class of the response variable.
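percOver
parameter used in ubSMOTE: percOver/100 is the number of new instances generated for each minority instance.
percUnder
parameter used in ubSMOTE: percUnder/100 is the number of majority instances randomly selected for each synthetic instance.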
k
parameter used in ubOver, ubSMOTE, ubCNN, ubENN, ubNCL
perc
parameter used in ubUnder
method
parameter used in ubUnder
w
parameter used in ubUnder
verbose
print extra information (TRUE/FALSE)

Value

The function returns a list:
X
input variables
Y
response variable
id.rm
indices of the removed instances, if available for the selected technique
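
As an illustration of the returned list, the sketch below applies Tomek link removal to the ubIonosphere data shipped with the package and inspects each component. The variable names (input, output, cleaned) are chosen here for illustration only, and the counts depend on the data, so no output is shown.

```r
library(unbalanced)
data(ubIonosphere)

# split into inputs and response (Class is the last column)
n <- ncol(ubIonosphere)
input <- ubIonosphere[, -n]
output <- ubIonosphere$Class

# ubTomek is one of the techniques that removes instances
cleaned <- ubBalance(X = input, Y = output, type = "ubTomek")

names(cleaned)         # components of the returned list
dim(cleaned$X)         # remaining input rows and columns
table(cleaned$Y)       # class counts after cleaning
length(cleaned$id.rm)  # length of the removed-index vector
```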

Details

The argument type can take the following values: "ubOver" (over-sampling), "ubUnder" (under-sampling), "ubSMOTE" (SMOTE), "ubOSS" (One-Sided Selection), "ubCNN" (Condensed Nearest Neighbor), "ubENN" (Edited Nearest Neighbor), "ubNCL" (Neighborhood Cleaning Rule), "ubTomek" (Tomek links).
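
Because every technique is selected through the same type argument, several of them can be compared in a loop. The following is a minimal sketch, assuming the ubIonosphere data and the default values of the remaining arguments; the names types and counts are illustrative, and the resulting class counts are not shown because the sampling-based techniques are random.

```r
library(unbalanced)
data(ubIonosphere)

n <- ncol(ubIonosphere)
input <- ubIonosphere[, -n]
output <- ubIonosphere$Class

set.seed(1)  # sampling-based techniques are random
types <- c("ubOver", "ubUnder", "ubSMOTE")

# class counts of the response after each technique (one column per type)
counts <- sapply(types, function(tp) {
  res <- ubBalance(X = input, Y = output, type = tp)
  table(res$Y)
})

table(output)  # class counts before balancing
counts         # class counts after balancing
```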

References

Dal Pozzolo, Andrea, et al. "Racing for unbalanced methods selection." Intelligent Data Engineering and Automated Learning - IDEAL 2013. Springer Berlin Heidelberg, 2013. 24-31.

See Also

ubRacing, ubOver, ubUnder, ubSMOTE, ubOSS, ubCNN, ubENN, ubNCL, ubTomek

Examples

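library(unbalanced)
data(ubIonosphere)

# the last column (Class) is the response; the remaining columns are the inputs
n <- ncol(ubIonosphere)
output <- ubIonosphere$Class
input <- ubIonosphere[, -n]

# balance the dataset with SMOTE
data <- ubBalance(X = input, Y = output, type = "ubSMOTE", percOver = 300, percUnder = 150, verbose = TRUE)
# recombine the balanced inputs and response into a single data frame
balancedData <- cbind(data$X, Class = data$Y)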
