SmartMeterAnalytics (version 1.1.1)

smote: Synthetic minority oversampling (SMOTE)

Description

Performs oversampling by creating new synthetic instances of the smaller classes.

Usage

smote(
  Variables,
  Classes,
  subset_use = NULL,
  k = 5,
  use_nearest = TRUE,
  proportions = 0.9,
  equalise_with_undersampling = FALSE,
  safe = FALSE
)

Value

a list containing the new data.frame of independent variables and the new class labels

Arguments

Variables

the data.frame of independent variables that should be used to create new instances

Classes

the class labels in the prediction problem

subset_use

if given, only this subset is used for the oversampling. If NULL, all instances are used.

k

the number of neighbours used for generating new instances

use_nearest

should only the nearest neighbours be used? (very slow)

proportions

the proportion of the biggest class's size to which the classes should be equalized

equalise_with_undersampling

should additional undersampling be performed?

safe

should a safe version of SMOTE be used?
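
As a rough illustration of the proportions argument, the following base R sketch computes how many synthetic instances each class would need. It assumes that every class is raised to proportions times the size of the biggest class, which is one plausible reading of the argument description and is not taken from the package source. The helper name instances_needed is hypothetical and not part of SmartMeterAnalytics.

```r
# Hypothetical helper, NOT part of SmartMeterAnalytics.
# Assumption: each class is filled up to `proportions` times the
# size of the biggest class; classes already above that size need nothing.
instances_needed <- function(Classes, proportions = 0.9) {
  counts <- table(Classes)
  n <- setNames(as.integer(counts), names(counts))  # class sizes as a named vector
  target <- round(proportions * max(n))             # size every class should reach
  pmax(target - n, 0)                               # missing instances per class
}

# Example labels: one large class, one small class, one nearly balanced class
labels <- factor(c(rep("a", 100), rep("b", 20), rep("c", 95)))
needed <- instances_needed(labels, proportions = 0.9)
print(needed)
```

Under this reading, only class "b" falls below the target size and would receive synthetic instances.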

Author

Ilya Kozlovskiy, Konstantin Hopf konstantin.hopf@uni-bamberg.de

Details

SMOTE is used to generate synthetic datapoints of a smaller class, for example to overcome the problem of imbalanced classes in classification.
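
The general idea behind SMOTE can be sketched in a few lines of base R: pick an instance of the smaller class, pick one of its k nearest neighbours from the same class, and place a new point at a random position on the line segment between the two. The function below is a minimal sketch of that idea for numeric variables only. It is not the implementation used by smote(), and the names smote_sketch and n_new are assumptions made for illustration.

```r
# Minimal SMOTE-style sketch in base R (illustrative only, not the package code).
# X:     numeric data.frame or matrix holding the instances of ONE smaller class
# n_new: number of synthetic instances to create (hypothetical parameter)
# k:     number of nearest neighbours to choose from
smote_sketch <- function(X, n_new, k = 5) {
  X <- as.matrix(X)
  n <- nrow(X)
  stopifnot(n >= 2)
  k <- min(k, n - 1)                        # at most n - 1 other points exist
  D <- as.matrix(dist(X))                   # pairwise Euclidean distances
  synthetic <- matrix(NA_real_, nrow = n_new, ncol = ncol(X),
                      dimnames = list(NULL, colnames(X)))
  for (i in seq_len(n_new)) {
    base <- sample.int(n, 1)                # random instance of the class
    d <- D[base, ]
    d[base] <- Inf                          # exclude the instance itself
    neighbours <- order(d)[seq_len(k)]      # indices of the k nearest neighbours
    partner <- neighbours[sample.int(k, 1)] # one neighbour chosen at random
    gap <- runif(1)                         # random position between the two points
    synthetic[i, ] <- X[base, ] + gap * (X[partner, ] - X[base, ])
  }
  as.data.frame(synthetic)
}

set.seed(1)
minority <- data.frame(x = rnorm(10), y = rnorm(10))
new_points <- smote_sketch(minority, n_new = 20, k = 5)
```

Because every synthetic point lies between two existing points, the new instances stay inside the range already covered by the smaller class.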