UBL (version 0.0.3)

RandUnderRegress: Random under-sampling for imbalanced regression problems

Description

This function performs a random under-sampling strategy for imbalanced regression problems. Essentially, a percentage of the cases in the "class(es)" selected by the user (bumps whose relevance lies below a defined threshold) is randomly removed. Alternatively, the strategy can be applied either to balance all the existing "classes" or to "smoothly invert" the frequency of the examples in each "class".

Usage

RandUnderRegress(form, dat,  rel = "auto", thr.rel = 0.5, 
                 C.perc = "balance", repl = FALSE)

Arguments

form
A formula describing the prediction problem.
dat
A data frame containing the original imbalanced data set.
rel
The relevance function which can be automatically ("auto") determined (the default) or may be provided by the user through a matrix with interpolating points.
thr.rel
A number indicating the relevance threshold below which a case is considered as belonging to the normal "class".
C.perc
A list containing the under-sampling percentage/s to apply to all/each "class" (bump) obtained with the relevance threshold. Examples are randomly removed in each normal "class" according to a percentage. Moreover, different percentages m
repl
A boolean value controlling the possibility of having repetition of examples in the under-sampled data set. Defaults to FALSE.

Value

The function returns a data frame with the new data set resulting from the application of the random under-sampling strategy.

Details

This function performs a random under-sampling strategy for dealing with imbalanced regression problems. The examples removed are randomly selected among the examples belonging to the normal "class(es)" (bumps whose relevance lies below the defined threshold). The user can choose one or more bumps to be under-sampled.
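
The idea can be illustrated with a minimal base R sketch. This is not the UBL implementation: the helper under_sample_normal(), its keep_frac argument, and the stand-in relevance rule (scaled distance from the median) are all hypothetical and serve only to show how cases below a relevance threshold could be randomly thinned while the remaining cases are left untouched.

```r
# Illustrative sketch only: NOT the code used by RandUnderRegress.
# under_sample_normal() and its relevance rule are hypothetical.
under_sample_normal <- function(dat, target, thr.rel = 0.5,
                                keep_frac = 0.5, repl = FALSE) {
  y <- dat[[target]]

  # Stand-in relevance: distance from the median, scaled to [0, 1]
  dev <- abs(y - median(y))
  relevance <- if (max(dev) > 0) dev / max(dev) else rep(0, length(y))

  normal_idx <- which(relevance < thr.rel)   # cases in the normal "class"
  rare_idx   <- which(relevance >= thr.rel)  # cases that are always kept

  # Keep a randomly chosen fraction `keep_frac` of the normal cases
  n_keep <- round(keep_frac * length(normal_idx))
  picked <- sample.int(length(normal_idx), n_keep, replace = repl)
  kept   <- normal_idx[picked]

  dat[sort(c(rare_idx, kept)), , drop = FALSE]
}

set.seed(1)
data(morley)
reduced <- under_sample_normal(morley, "Speed", thr.rel = 0.5, keep_frac = 0.5)
nrow(morley)   # 100 rows in the original data
nrow(reduced)  # fewer rows: only normal cases were removed
```

In this sketch a single threshold splits the data into one normal and one rare group; the actual function works with the bumps of the relevance function, so its results will generally differ.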

See Also

RandOverRegress

Examples

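library(UBL)
data(morley)

# Under-sample the normal "class" using a user-defined percentage
C.perc <- list(0.5)
myUnd <- RandUnderRegress(Speed ~ ., morley, C.perc = C.perc)
# Balance all the existing "classes"
Bal <- RandUnderRegress(Speed ~ ., morley, C.perc = "balance")
# Smoothly invert the frequency of the examples in each "class"
Ext <- RandUnderRegress(Speed ~ ., morley, C.perc = "extreme")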
