tsg (version 0.1.1)

generate_frequency: Generate frequency table

Description

Creates frequency tables for one or more categorical variables, optionally grouped by other variables. The function supports various enhancements such as sorting, totals, percentages, cumulative statistics, handling of missing values, and label customization. It returns a single table or a list of frequency tables.

Usage

generate_frequency(
  data,
  ...,
  sort_value = TRUE,
  sort_desc = TRUE,
  sort_except = NULL,
  add_total = TRUE,
  add_percent = TRUE,
  add_cumulative = FALSE,
  add_cumulative_percent = FALSE,
  as_proportion = FALSE,
  include_na = TRUE,
  recode_na = "auto",
  position_total = c("bottom", "top"),
  calculate_per_group = TRUE,
  group_separator = " - ",
  group_as_list = FALSE,
  label_as_group_name = TRUE,
  label_stub = NULL,
  label_na = "Not reported",
  label_total = "Total",
  expand_categories = TRUE,
  convert_factor = FALSE,
  collapse_list = FALSE,
  top_n = NULL,
  top_n_only = FALSE,
  metadata = NULL
)

Value

A frequency table (tibble, possibly nested) or a list of such tables. Additional attributes such as labels, metadata, and grouping information may be attached. The returned object is of class "tsg".

Arguments

data

A data frame (typically tibble) containing the variables to summarize.

...

One or more unquoted variable names (passed via tidy evaluation) for which to compute frequency tables.

sort_value

Logical. If TRUE, sorts the table by frequency. If FALSE, the output is sorted by the variable values in ascending order.

sort_desc

Logical. If TRUE, sorts in descending order of frequency. Applies when sort_value = TRUE; if sort_value = FALSE, categories are sorted by value in ascending order instead.

sort_except

Optional character vector. Variables to exclude from sorting.

add_total

Logical. If TRUE, adds a total row or value to the frequency table.

add_percent

Logical. If TRUE, adds percent or proportion values to the table.

add_cumulative

Logical. If TRUE, adds cumulative frequency counts.

add_cumulative_percent

Logical. If TRUE, adds cumulative percentages (or proportions if as_proportion = TRUE).

as_proportion

Logical. If TRUE, displays proportions instead of percentages (range 0–1).

include_na

Logical. If TRUE, includes missing values in the frequency table.

recode_na

Character or NULL. Value used to replace missing values in labelled vectors; "auto" will determine a code automatically.

position_total

Character. Where to place the total row: "top" or "bottom".

calculate_per_group

Logical. If TRUE, calculates frequencies within the groups already defined on data (for example, via dplyr::group_by()).

group_separator

Character. Separator used when concatenating group values in list output (if group_as_list = TRUE).

group_as_list

Logical. If TRUE, output is a list of frequency tables for each group combination.

label_as_group_name

Logical. If TRUE, uses variable labels as names in the output list; otherwise, uses variable names.

label_stub

Optional character vector used for labeling output tables (e.g., for export or display).

label_na

Character. Label to use for missing (NA) values.

label_total

Character. Label used for the total row/category.

expand_categories

Logical. If TRUE, ensures all categories (including those with zero counts) are included in the output.

convert_factor

Logical. If TRUE, converts labelled variables to factors in the output. See also convert_factor().

collapse_list

Logical. If TRUE and group_as_list = TRUE, collapses the list of frequency tables into a single data frame with group identifiers. See also collapse_list().

top_n

Integer or NULL. If specified, limits the output to the top n categories by frequency.

top_n_only

Logical. If TRUE and top_n is specified, only the top n categories are included, excluding others.

metadata

A named list with optional metadata to attach as attributes, e.g. title, subtitle, and source_note.
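
The percent, cumulative, and proportion options above can be easier to follow with concrete numbers. The following is a minimal base R sketch of those quantities, not the tsg implementation: the `status` vector and the column names (`frequency`, `percent`, `cumulative`, `cumulative_percent`, `proportion`) are made up for illustration and may differ from the columns tsg actually produces.

```r
# Base R sketch (not the tsg implementation) of the quantities described by
# add_percent, add_cumulative, add_cumulative_percent, and as_proportion.
# `status` and all column names below are hypothetical.
status <- c("single", "married", "married", "widowed", "single", "married")

# Count each category, largest count first
counts <- sort(table(status), decreasing = TRUE)

freq <- data.frame(
  category = names(counts),
  frequency = as.integer(counts),
  stringsAsFactors = FALSE
)

# Share of each category, on a 0 to 100 scale
freq$percent <- 100 * freq$frequency / sum(freq$frequency)

# Running totals of the counts and of the percentages
freq$cumulative <- cumsum(freq$frequency)
freq$cumulative_percent <- cumsum(freq$percent)

# The same share expressed as a proportion, on a 0 to 1 scale
freq$proportion <- freq$frequency / sum(freq$frequency)

print(freq)
```

The last row of the cumulative columns always equals the overall total (the number of observations, and 100 percent), which is a quick sanity check on any frequency table.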

See Also

generate_crosstab(), generate_output(), rename_label(), remove_label()

Examples

# Using built-in dataset `person_record`


# Basic usage
person_record |>
  generate_frequency(sex)

# Multiple variables
person_record |>
  generate_frequency(sex, age, marital_status)

# Grouping
person_record |>
  dplyr::group_by(sex) |>
  generate_frequency(marital_status)

# Output group as list
person_record |>
  dplyr::group_by(sex) |>
  generate_frequency(marital_status, group_as_list = TRUE)

# Sorting

# Sort by frequency (sort_value = TRUE is the default)
person_record |>
  generate_frequency(age, sort_value = TRUE)

# If FALSE, the output will be sorted by the variable values in ascending order.
person_record |>
  generate_frequency(age, sort_value = FALSE)

# See the package vignettes for more examples.
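
The difference between the two sorting examples above can be seen on a small vector. This is a base R sketch of the two orderings only, not how tsg implements them. The `age_group` vector is invented for illustration.

```r
# Base R sketch (not the tsg implementation) of the two orderings.
# `age_group` is a hypothetical vector used only for illustration.
age_group <- c("30-39", "20-29", "30-39", "40-49", "30-39", "20-29")

# table() lists categories in ascending order of their values,
# comparable to sort_value = FALSE
by_value <- table(age_group)

# Reordering by count, largest first, is comparable to
# sort_value = TRUE with sort_desc = TRUE
by_frequency <- sort(by_value, decreasing = TRUE)

print(by_value)
print(by_frequency)
```

Sorting by value keeps ordered categories such as age bands in their natural sequence, while sorting by frequency puts the most common categories first.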