imbalance

imbalance provides a set of tools to work with imbalanced datasets: novel oversampling algorithms, filtering of instances and evaluation of synthetic instances.

Installation

You can install the development version of imbalance from GitHub with:

# install.packages("devtools")
devtools::install_github("ncordon/imbalance")

Examples

Run the pdfos algorithm on the imbalanced newthyroid1 dataset and plot a comparison of the first three attributes before and after oversampling.

library("imbalance")
data(newthyroid1)

newSamples <- pdfos(newthyroid1, numInstances = 80)
# Join new samples with old imbalanced dataset
newDataset <- rbind(newthyroid1, newSamples)
# Plot a visual comparison between both datasets
plotComparison(newthyroid1, newDataset, attrs = names(newthyroid1)[1:3], cols = 2, classAttr = "Class")

Filter the synthetic samples with neater, then plot the comparison again:

filteredSamples <- neater(newthyroid1, newSamples, iterations = 500)
#> [1] "12 samples filtered by NEATER"
filteredNewDataset <- rbind(newthyroid1, filteredSamples)
plotComparison(newthyroid1, filteredNewDataset, attrs = names(newthyroid1)[1:3])
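
Besides plotting, tabulating the class column is a quick way to confirm that joining synthetic samples changed the class counts as intended. The following is a minimal base R sketch on toy data: the data frames `original` and `synthetic` and the column `x` are hypothetical stand-ins, not newthyroid1 or actual pdfos/neater output.

```r
# Toy stand-in for an imbalanced dataset (hypothetical values)
original <- data.frame(
  x = c(1.0, 1.2, 0.9, 1.1, 3.0, 3.1),
  Class = factor(
    c("negative", "negative", "negative", "negative", "positive", "positive"),
    levels = c("negative", "positive")
  )
)

# Toy stand-in for the synthetic minority samples an oversampler returns
synthetic <- data.frame(
  x = c(2.9, 3.2),
  Class = factor(c("positive", "positive"), levels = c("negative", "positive"))
)

# Join synthetic samples with the original data, as in the examples above
combined <- rbind(original, synthetic)

# Class counts before and after the join
countsBefore <- table(original$Class)
countsAfter <- table(combined$Class)
print(countsBefore)
print(countsAfter)
```

The same `table()` call can be applied to the `Class` column of newthyroid1 and filteredNewDataset to see how many synthetic samples survived filtering.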

Run ADASYN through the oversample wrapper provided by the package and compare the imbalance ratio of the dataset before and after oversampling:

data(glass0)
imbalanceRatio(glass0)
#> [1] 0.4861111
newDataset <- oversample(glass0, method = "ADASYN")
imbalanceRatio(newDataset)
#> [1] 0.9722222
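
The printed values are consistent with a ratio of minority class size to majority class size, where 1 means perfectly balanced. The base R sketch below illustrates that reading. `ratioSketch` is a hypothetical helper, not part of the package, and the class sizes 70 and 144 are assumed here only because they reproduce the first value printed above.

```r
# Hypothetical helper: minority count divided by majority count
# (an assumption about what the ratio means, not the package's code)
ratioSketch <- function(labels) {
  counts <- table(labels)
  stopifnot(length(counts) == 2)  # binary datasets only
  min(counts) / max(counts)
}

# Assumed class sizes, chosen to reproduce the ratio shown above
labels <- factor(c(rep("positive", 70), rep("negative", 144)))
ratio <- ratioSketch(labels)
print(ratio)
#> [1] 0.4861111
```

Under this reading, a ratio near 1 after oversampling means the minority class has grown to almost the size of the majority class.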

Install

install.packages('imbalance')

Monthly Downloads

8,850

Version

1.0.2.1

License

GPL (>= 2) | file LICENSE

Maintainer

Ignacio Cordón

Last Published

April 7th, 2020

Functions in imbalance (1.0.2.1)

wisconsin

Imbalanced binary breast cancer Wisconsin dataset
wracog

Wrapper for rapidly converging Gibbs algorithm.
oversample

Wrapper that encapsulates a collection of algorithms to perform a class balancing preprocessing task for binary class datasets
newthyroid1

Imbalanced binary thyroid gland data
yeast4

Imbalanced binary yeast protein localization sites
pdfos

Probability density function estimation based oversampling
plotComparison

Plots comparison between the original and the new balanced dataset.
racog

Rapidly converging Gibbs algorithm.
trainWrapper

Generic methods to train classifiers
rwo

Random walk oversampling
neater

Filtering of oversampled data based on non-cooperative game theory
imbalanceRatio

Compute imbalance ratio of a binary dataset
iris0

Imbalanced binary iris dataset
glass0

Imbalanced binary glass identification
ecoli1

Imbalanced binary ecoli protein localization sites
banana

Binary banana dataset
imbalance

imbalance: A package to treat imbalanced datasets
mwmote

Majority weighted minority oversampling technique for imbalanced dataset learning
haberman

Haberman's survival data