scutr (version 0.2.0)

undersample_hclust: Undersample a dataset by hierarchical clustering.

Description

Undersample one class of a dataset to m rows using hierarchical clustering.

Usage

undersample_hclust(data, cls, cls_col, m, k = 5, h = NA, ...)

Value

A data frame containing only rows of class cls, undersampled to m rows.

Arguments

data

Dataset to be undersampled.

cls

Majority class that will be undersampled.

cls_col

Column in data containing class memberships.

m

Number of samples (rows) in the undersampled dataset.

k

Number of clusters to derive from clustering.

h

Height at which to cut the clustering tree. Used only when k is NA.

...

Additional arguments passed to dist().
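
The arguments above suggest a general recipe: subset the rows of cls, cluster them on a distance matrix, cut the tree by k or h, then draw m rows from the clusters. The following is a minimal base R sketch of that recipe, not the scutr source. The function name sketch_undersample_hclust is hypothetical, the round-robin draw across clusters is an assumption (this page does not say how scutr allocates samples among clusters), and the sketch uses NULL rather than NA for the unused cut argument because that is how stats::cutree() marks an argument as unset.

```r
# Illustrative sketch only; NOT the scutr implementation.
# sketch_undersample_hclust is a hypothetical helper name.
sketch_undersample_hclust <- function(data, cls, cls_col, m, k = 5, h = NULL, ...) {
  # Keep only the rows of the class to be undersampled.
  subset <- data[data[[cls_col]] == cls, , drop = FALSE]

  # Cluster on every column except the class column; ... is passed to dist().
  features <- subset[, names(subset) != cls_col, drop = FALSE]
  tree <- hclust(dist(features, ...))

  # Cut by number of clusters (k) or, when k is NULL, by height (h).
  clusters <- if (!is.null(k)) cutree(tree, k = k) else cutree(tree, h = h)

  # Shuffle the row indices within each cluster.
  pools <- lapply(
    split(seq_len(nrow(subset)), clusters),
    function(i) i[sample.int(length(i))]
  )

  # Assumption: take one row per cluster in turn until m rows are picked
  # (or every cluster is exhausted).
  picked <- integer(0)
  while (length(picked) < m && any(lengths(pools) > 0)) {
    for (j in seq_along(pools)) {
      if (length(pools[[j]]) > 0 && length(picked) < m) {
        picked <- c(picked, pools[[j]][1])
        pools[[j]] <- pools[[j]][-1]
      }
    }
  }
  subset[picked, , drop = FALSE]
}

set.seed(1)
res <- sketch_undersample_hclust(iris, "setosa", "Species", 15)
nrow(res)

# Cutting by height instead of by cluster count.
res_h <- sketch_undersample_hclust(iris, "setosa", "Species", 10, k = NULL, h = 1)
nrow(res_h)
```

Drawing from every cluster in turn keeps small clusters represented in the reduced sample, which is the usual motivation for cluster-based undersampling over plain random undersampling.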

Examples

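library(scutr)
# Class counts in the full dataset: 50 rows per species.
table(iris$Species)
# Undersample the "setosa" class down to 15 rows.
undersamp <- undersample_hclust(iris, "setosa", "Species", 15)
nrow(undersamp)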