Identifies categories in a character or factor vector that appear less frequently than a specified threshold.
detect_categorical_outliers(data, min_freq = 0.01)
A data frame summarizing the categories:
The name of the level.
Absolute frequency.
Relative frequency.
Logical flag.
data: A vector (character or factor).
min_freq: Numeric. The minimum relative frequency, given as a proportion between 0 and 1, that a category must reach to be considered normal. Defaults to 0.01 (1 percent).
The function calculates the relative frequency of each unique level. If the frequency is below min_freq, the category is flagged as an outlier.
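The logic described above can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions, not the package's actual source: the function name sketch_categorical_outliers and the column names category, count, freq, and is_outlier are hypothetical, and missing values are simply dropped because table() ignores NA by default.

```r
# Hypothetical re-implementation for illustration only; the real
# detect_categorical_outliers() may differ in column names and NA handling.
sketch_categorical_outliers <- function(data, min_freq = 0.01) {
  counts <- table(data)                    # absolute frequency of each level (NA dropped)
  rel <- as.numeric(counts) / sum(counts)  # relative frequency of each level
  data.frame(
    category = names(counts),              # assumed column name
    count = as.integer(counts),            # assumed column name
    freq = rel,                            # assumed column name
    is_outlier = rel < min_freq,           # TRUE when strictly below the threshold
    stringsAsFactors = FALSE
  )
}

# "b" makes up 1 of 10 values (0.1), which is below the 0.2 threshold.
sketch_categorical_outliers(c(rep("a", 9), "b"), min_freq = 0.2)
```

In this sketch the comparison is strict, so a category whose relative frequency equals min_freq exactly is not flagged, which matches the "below min_freq" wording.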
# "Barcalona" (a misspelling) appears once in 11 values (about 9%), below the 10% threshold
cities <- c(rep("Madrid", 10), "Barcalona")
detect_categorical_outliers(cities, min_freq = 0.1)