imbalance v0.1.1

Preprocessing Algorithms for Imbalanced Datasets

Algorithms to treat imbalanced datasets. Imbalanced datasets usually degrade classifier performance, so it is important to preprocess the data before training a classifier. This package implements recent preprocessing algorithms from the literature.

Readme

imbalance

Project Status: WIP – Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.

imbalance provides a set of tools to work with imbalanced datasets: novel oversampling algorithms, filtering of instances and evaluation of synthetic instances.

Installation

You can install imbalance from GitHub with:

# install.packages("devtools")
devtools::install_github("ncordon/imbalance")

Examples

Run the pdfos algorithm on the imbalanced newthyroid1 dataset and plot a comparison between attributes.

library("imbalance")
data(newthyroid1)

newSamples <- pdfos(newthyroid1, numInstances = 80)
# Join new samples with old imbalanced dataset
newDataset <- rbind(newthyroid1, newSamples)
# Plot a visual comparison between both datasets
plotComparison(newthyroid1, newDataset, attrs = names(newthyroid1)[1:3], cols = 2, classAttr = "Class")

After filtering examples with neater:

filteredSamples <- neater(newthyroid1, newSamples, iterations = 500)
#> [1] "15 samples filtered by NEATER"
filteredNewDataset <- rbind(newthyroid1, filteredSamples)
plotComparison(newthyroid1, filteredNewDataset, attrs = names(newthyroid1)[1:3])
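
Besides the visual comparison, a quick numeric check of the class distribution before and after joining the synthetic instances can be useful. The following is a minimal sketch in base R only. It uses a toy data frame as a stand-in for newthyroid1 and for the oversampler output, and class_ratio is a hypothetical helper written for this illustration, not a function provided by the package.

```r
# Hypothetical helper (not part of the imbalance package): ratio of the
# smallest class size to the largest class size, a value in (0, 1].
class_ratio <- function(dataset, classAttr = "Class") {
  counts <- table(dataset[[classAttr]])
  min(counts) / max(counts)
}

# Toy stand-in for an imbalanced dataset: 90 negative vs 10 positive rows.
set.seed(1)
toy <- data.frame(
  x = rnorm(100),
  Class = factor(c(rep("negative", 90), rep("positive", 10)))
)

# Toy stand-in for oversampler output: 80 extra minority-class rows.
synthetic <- data.frame(
  x = rnorm(80),
  Class = factor(rep("positive", 80), levels = levels(toy$Class))
)

# Join new samples with the old imbalanced dataset, as in the examples above.
balanced <- rbind(toy, synthetic)

ratio_before <- class_ratio(toy)       # 10 / 90
ratio_after  <- class_ratio(balanced)  # 90 / 90
```

With the package loaded, the same helper could presumably be applied as class_ratio(newthyroid1) and class_ratio(filteredNewDataset), assuming the class column is named "Class" as in the plotComparison call above.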

Functions in imbalance

| Name | Description |
| --- | --- |
| pdfos | Probability density function estimation based oversampling |
| haberman | Haberman's survival data |
| plotComparison | Plots comparison between the original and the new balanced dataset |
| ecoli1 | Imbalanced binary ecoli protein localization sites |
| neater | Filtering of oversampled data based on non-cooperative game theory |
| newthyroid1 | Imbalanced binary thyroid gland data |
| glass0 | Imbalanced binary glass identification |
| iris0 | Imbalanced binary iris dataset |
| wracog | Wrapper for rapidly converging Gibbs algorithm |
| mwmote | Majority weighted minority oversampling technique for imbalanced dataset learning |
| yeast4 | Imbalanced binary yeast protein localization sites |
| imabalace | imbalance: A package to treat imbalanced datasets |
| trainWrapper | Generic methods to train classifiers |
| wisconsin | Imbalanced binary breast cancer Wisconsin dataset |
| racog | Rapidly converging Gibbs algorithm |
| rwo | Random walk oversampling |

Vignettes of imbalance

Name
imbalance.Rmd
institute-of-mathematical-statistics.csl
kernel-estimation.png
monte-carlo.png
references.bib
smote-flaws.png

Details

Type Package
License GPL (>= 2) | file LICENSE
Encoding UTF-8
LazyData true
BugReports http://github.com/ncordon/imbalance/issues
URL http://github.com/ncordon/imbalance
RoxygenNote 6.0.1
VignetteBuilder knitr
LinkingTo Rcpp, RcppArmadillo
NeedsCompilation yes
Packaged 2017-11-15 14:58:09 UTC; nuwanda
Repository CRAN
Date/Publication 2017-11-15 15:18:51 UTC
