unbalanced (version 2.0)

ubOSS: One Side Selection

Description

One Side Selection (OSS) is an undersampling method that applies Tomek links and then Condensed Nearest Neighbor (CNN) to the dataset.
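The Tomek link step can be illustrated with a minimal base-R sketch. This is not the package's implementation: the helper name `tomek_majority_idx` is hypothetical, and the sketch assumes Euclidean distance, numeric features, and a response coded 0 (majority) / 1 (minority). A Tomek link is a pair of observations from opposite classes that are each other's nearest neighbor; the sketch returns the majority-class members of such pairs, which are the usual candidates for removal.

```r
# Hypothetical helper (not part of the unbalanced package):
# indices of majority-class rows that take part in a Tomek link.
tomek_majority_idx <- function(X, Y) {
  d <- as.matrix(dist(as.matrix(X)))     # pairwise Euclidean distances
  diag(d) <- Inf                         # a point is not its own neighbor
  nn <- unname(apply(d, 1, which.min))   # nearest neighbor of each row
  i <- seq_len(nrow(d))
  mutual <- nn[nn] == i                  # row i and nn[i] point at each other
  opposite <- Y != Y[nn]                 # the two rows have different classes
  which(mutual & opposite & Y == 0)      # keep only the majority-class side
}

# Toy data: rows 2 and 3 are mutual nearest neighbors from opposite classes.
X <- matrix(c(0, 1, 1.1, 5, 5.2), ncol = 1)
Y <- factor(c(0, 0, 1, 0, 0), levels = c(0, 1))
tomek_idx <- tomek_majority_idx(X, Y)    # row 2 is flagged
```

Rows 4 and 5 are also mutual nearest neighbors, but they share a class, so they do not form a Tomek link.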

Usage

ubOSS(X, Y, verbose = TRUE)

Arguments

X
the input variables of the unbalanced dataset.
Y
the response variable of the unbalanced dataset. It must be a binary factor where the majority class is coded as 0 and the minority as 1.
verbose
logical; if TRUE, print extra information.
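If the response is not already coded this way, it can be recoded before the call. The following is a minimal sketch in base R; the helper name `recode_binary` is hypothetical, and it assumes the input has exactly two distinct values (on a tie, the value that sorts first is treated as the minority).

```r
# Hypothetical helper (not part of the unbalanced package):
# recode a two-class vector so the majority is "0" and the minority is "1".
recode_binary <- function(y) {
  counts <- table(as.character(y))                 # class frequencies
  stopifnot(length(counts) == 2)                   # binary response only
  minority <- names(counts)[which.min(counts)]     # least frequent label
  factor(as.integer(as.character(y) == minority), levels = c(0, 1))
}

y <- c("good", "good", "bad", "good")
y01 <- recode_binary(y)   # "bad" is the minority, so it becomes 1
```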

Value

The function returns a list:
X
input variables
Y
response variable

Details

In order to compute nearest neighbors, only numeric features are allowed.
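One way to meet this requirement is to drop non-numeric columns before the call. The sketch below uses base R only; the helper name `numeric_only` is hypothetical, and discarding columns (rather than encoding them) is an assumption that may not suit every dataset.

```r
# Hypothetical helper (not part of the unbalanced package):
# keep only the numeric columns of a data frame.
numeric_only <- function(X) {
  X <- as.data.frame(X)
  keep <- vapply(X, is.numeric, logical(1))   # TRUE for numeric columns
  X[, keep, drop = FALSE]                     # drop = FALSE keeps a data frame
}

df <- data.frame(a = 1:3, b = c("x", "y", "z"), c = c(0.5, 1, 2))
num_df <- numeric_only(df)   # column b is removed
```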

References

M. Kubat and S. Matwin. Addressing the curse of imbalanced training sets: one-sided selection. In Proceedings of the Fourteenth International Conference on Machine Learning, pages 179-186. Morgan Kaufmann, 1997.

See Also

ubBalance

Examples

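library(unbalanced)
data(ubIonosphere)

# the response (Class) is the last column; the others are the inputs
n <- ncol(ubIonosphere)
output <- ubIonosphere$Class
input <- ubIonosphere[, -n]

# apply One Side Selection, then rebuild a single data frame
data <- ubOSS(X = input, Y = output)
newData <- cbind(data$X, Class = data$Y)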
