fctutils (version 0.0.7)

ft_filter_freq: Filter Factor Levels by Frequency and Recalculate Character Frequencies

Description

Drops factor levels that occur fewer than a specified number of times, then recalculates character frequencies from the remaining levels. Options control how NA values are handled and whether additional information is returned.

Usage

ft_filter_freq(
  factor_vec,
  min_freq = 1,
  na.rm = FALSE,
  case = FALSE,
  decreasing = TRUE,
  return_info = FALSE
)

Value

If return_info is FALSE, returns a factor vector with levels filtered by the specified frequency threshold and reordered by recalculated total character frequency. If return_info is TRUE, returns a list containing the filtered factor vector (filtered_factor), the removed levels (removed_levels), and the character frequency table (char_freq_table).

Arguments

factor_vec

A factor vector to be filtered.

min_freq

A positive integer specifying the minimum frequency threshold. Factor levels occurring fewer than this many times are dropped. Default is 1.

na.rm

Logical. Should NA values be removed before filtering and frequency calculation? Default is FALSE.

case

Logical. Should the character frequency count be case-sensitive? Default is FALSE.

decreasing

Logical. Should the ordering of levels be decreasing by total character frequency? Default is TRUE.

return_info

Logical. Should the function return additional information such as removed levels and character frequencies? Default is FALSE.
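
To make the case argument concrete, here is a small base-R illustration of case-sensitive versus case-insensitive character counting. It does not call ft_filter_freq, and the use of tolower() for case folding is an assumption about how such counting might be done, not a statement about the package internals.

```r
# Base-R illustration only; ft_filter_freq itself is not called here.
x <- c("Apple", "apple")

# Split every value into single characters
chars <- unlist(strsplit(x, ""))

# Case-sensitive count: "A" and "a" are tallied separately
case_sensitive <- table(chars)

# Case-insensitive count (assumed folding via tolower): "A" is merged into "a"
case_insensitive <- table(tolower(chars))

case_sensitive
case_insensitive
```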

Author

Kai Guo

Examples

# Example factor vector
factor_vec <- factor(c('apple', 'banana', 'cherry', 'date', 'banana', 'apple', 'fig', NA))

# Drop levels occurring fewer than 2 times and reorder by character frequency
ft_filter_freq(factor_vec, min_freq = 2)

# Filter levels, remove NA values, and return additional information
result <- ft_filter_freq(factor_vec, min_freq = 2, na.rm = TRUE, return_info = TRUE)
result$filtered_factor
result$removed_levels
result$char_freq_table
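
For readers who want to see the steps described above spelled out, the following is a minimal base-R sketch of one way to filter levels by frequency and reorder them by total character frequency. The function name sketch_filter_freq is hypothetical, and several details are assumptions rather than documented behavior of ft_filter_freq: elements belonging to removed levels are dropped from the vector, a level's score is the sum of the frequencies of its characters, and case folding uses tolower().

```r
# Hypothetical sketch; NOT the fctutils implementation.
sketch_filter_freq <- function(factor_vec, min_freq = 1, na.rm = FALSE,
                               case = FALSE, decreasing = TRUE) {
  # Optionally remove NA values first
  if (na.rm) factor_vec <- factor_vec[!is.na(factor_vec)]

  # Count occurrences per level (table() ignores NA by default)
  counts  <- table(factor_vec)
  keep    <- names(counts)[counts >= min_freq]
  removed <- setdiff(levels(factor_vec), keep)

  # Assumption: elements of removed levels are dropped, NAs are kept
  kept_vec <- factor_vec[is.na(factor_vec) | factor_vec %in% keep]
  values   <- as.character(kept_vec)

  # Recalculate character frequencies from the remaining values only
  chars <- as.character(unlist(strsplit(values[!is.na(values)], "")))
  if (!case) chars <- tolower(chars)
  char_freq <- table(chars)

  # Assumption: a level's score is the summed frequency of its characters
  score <- vapply(keep, function(lv) {
    ch <- strsplit(if (case) lv else tolower(lv), "")[[1]]
    sum(char_freq[ch])
  }, numeric(1))

  ordered_levels <- keep[order(score, decreasing = decreasing)]

  list(filtered_factor = factor(values, levels = ordered_levels),
       removed_levels  = removed,
       char_freq_table = char_freq)
}

fv  <- factor(c('apple', 'banana', 'cherry', 'date', 'banana', 'apple', 'fig', NA))
res <- sketch_filter_freq(fv, min_freq = 2, na.rm = TRUE)
levels(res$filtered_factor)
res$removed_levels
```

Under these assumptions, 'banana' sorts ahead of 'apple' because its characters are more frequent overall among the retained values. The package's own output may differ where its rules differ from the assumptions above.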