AugmenterR (version 0.1.0)

Data Augmentation for Machine Learning on Tabular Data

Description

Implementation of a data augmentation technique based on conditional entropy. It was devised by both authors during their master's studies and is discussed in detail in the second author's dissertation. It creates novel samples conditioned on a desired value of a categorical attribute, as a way to augment data for classification tasks. Tests discussed in the dissertation and in a forthcoming paper indicate that the novel samples satisfy several statistical assumptions. They also show significant improvement for machine learning models trained on small data sets.
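For readers unfamiliar with the quantity the description names, the sketch below computes the conditional entropy H(Y | X) of one categorical vector given another in base R. It is a generic textbook calculation, not code from AugmenterR; the function name `conditional_entropy` is an assumption made for illustration, and the package may use the quantity differently.

```r
# Illustrative only: conditional entropy H(Y | X) in bits for two
# categorical vectors of equal length. Not part of AugmenterR.
conditional_entropy <- function(x, y) {
  joint <- table(x, y) / length(x)   # joint probabilities p(x, y)
  px <- rowSums(joint)               # marginal probabilities p(x)
  cond <- joint / px                 # p(y | x); px is recycled down the rows
  terms <- joint * log2(cond)
  -sum(terms[joint > 0])             # skip empty cells, treating 0 * log(0) as 0
}

x <- c("a", "a", "b", "b")
y <- c("u", "v", "w", "w")
conditional_entropy(x, y)  # 0.5: y is a coin flip when x is "a", fixed when x is "b"
```

A value of 0 means the second vector is fully determined by the first; larger values mean more remaining uncertainty.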

Install

install.packages('AugmenterR')

Monthly Downloads

61

Version

0.1.0

License

MIT + file LICENSE

Maintainer

Rafael Pereira

Last Published

March 18th, 2021

Functions in AugmenterR (0.1.0)

GenerateASingleCandidate

Generates a novel sample from a target class and evaluates it against the other classes to check whether it satisfies the confidence level. Returns NA if the generated sample does not satisfy the condition; otherwise returns the novel sample.
Generate

Takes a data frame and generates a new sample. Returns the novel sample along with the intervals in which its values fall, so that it can be revalidated using confidence levels.
GenerateMultipleCandidates

Takes a data frame and some parameters and returns multiple novel samples from the target class.
ObtainCandidate

Takes a vector and returns a value along with the range of the attribute that contains it. Used alongside the other functions when generating a new sample.
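The idea of returning a value together with the range that contains it can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the function name `draw_value_with_range`, the `bins` parameter, and the use of equal-width bins are all hypothetical, and the real ObtainCandidate may choose values and ranges differently.

```r
# Hypothetical helper, not part of AugmenterR: draw one observed value from a
# numeric vector and report the equal-width bin (range) that contains it.
draw_value_with_range <- function(x, bins = 5) {
  x <- x[!is.na(x)]
  # bins + 1 break points spanning the observed range of x
  breaks <- seq(min(x), max(x), length.out = bins + 1)
  # sample.int avoids the sample(x, 1) pitfall when x has length one
  value <- x[sample.int(length(x), 1)]
  # index of the bin holding value; rightmost.closed keeps max(x) in the last bin
  idx <- findInterval(value, breaks, rightmost.closed = TRUE)
  list(value = value, lower = breaks[idx], upper = breaks[idx + 1])
}

set.seed(1)
res <- draw_value_with_range(iris$Sepal.Length)
str(res)
```

Keeping the bounds next to the value lets a later step check the drawn value against the same range without recomputing the bins.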