generate_crosstab: Generate cross-tabulation

Description

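Generates a cross-tabulation of x against one or more column variables, with optional totals, percentages or proportions, and grouped output.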

Usage

generate_crosstab(
  data,
  x,
  ...,
  add_total = TRUE,
  add_total_row = TRUE,
  add_total_column = TRUE,
  add_percent = TRUE,
  as_proportion = FALSE,
  percent_by_column = FALSE,
  name_separator = "_",
  label_separator = "__",
  label_total = "Total",
  label_total_column = NULL,
  label_total_row = NULL,
  label_na = "Not reported",
  label_as_group_name = TRUE,
  label_group_hierarchy = "All",
  include_na = TRUE,
  recode_na = "auto",
  group_separator = " - ",
  group_as_list = FALSE,
  group_as_hierarchy = FALSE,
  calculate_per_group = TRUE,
  expand_categories = TRUE,
  position_total = "bottom",
  sort_column_names = TRUE,
  collapse_list = FALSE,
  convert_factor = FALSE,
  multiple_columns = FALSE,
  multiple_columns_type = c("filtered", "stacked"),
  multiple_columns_filter = 1L,
  metadata = NULL
)

Value

A data frame or a list of data frames containing the cross-tabulation results. If group_as_list is TRUE, the output will be a list of data frames, one for each combination of grouping variable(s). Otherwise, a single data frame is returned. Each data frame includes counts and, if specified, percentages or proportions for each combination of x and the additional variables provided in ....
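
The difference between the two return shapes can be illustrated with base R alone. The following is a minimal sketch, not the implementation of generate_crosstab: the data frame df and its columns group, x, and y are hypothetical, and the long-format counts it produces are only meant to show "one data frame" versus "a named list with one data frame per group value".

```r
# Hypothetical toy data (not part of the package).
df <- data.frame(
  group = c("A", "A", "B", "B", "B"),
  x = c("yes", "no", "yes", "yes", "no"),
  y = c("u", "v", "u", "u", "v")
)

# A single table for the whole data set
# (analogous in shape to group_as_list = FALSE).
overall <- as.data.frame(table(x = df$x, y = df$y), responseName = "n")

# One table per group value, returned as a named list
# (analogous in shape to group_as_list = TRUE with one grouping variable).
per_group <- lapply(
  split(df, df$group),
  function(d) as.data.frame(table(x = d$x, y = d$y), responseName = "n")
)

print(names(per_group))
print(per_group[["B"]])
```

Each list element is keyed by the group value, so a single group's table can be pulled out by name.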

Arguments

data: A data frame (typically a tibble) containing the variables to summarize.
x: The variable to use for the rows of the cross-tabulation.
...: Additional variable(s) to use for the columns of the cross-tabulation. If none are provided, a frequency table for x will be returned.
add_total: Logical. If TRUE, adds a total row and/or a total column.
add_total_row: Logical. If TRUE, adds a total row.
add_total_column: Logical. If TRUE, adds a total column.
add_percent: Logical. If TRUE, adds percent or proportion values to the table.
as_proportion: Logical. If TRUE, displays proportions instead of percentages (range 0–1).
percent_by_column: Logical. If TRUE, percentages are calculated by column; otherwise, by row.
name_separator: Character. Separator used when constructing variable names in the output.
label_separator: Character. Separator used when constructing labels in the output.
label_total: Character. Label used for the total row/category.
label_total_column: Character. Label used for the total column/category.
label_total_row: Character. Label used for the total row/category.
label_na: Character. Label to use for missing (NA) values.
label_as_group_name: Logical. If TRUE, uses the variable label of the grouping variable(s) as the name in the output list.
label_group_hierarchy: Character. Label applied to grand-total entries when group_as_hierarchy = TRUE. Can be a single string (applied to all group levels) or a named character vector keyed by group column name for per-variable labels (e.g. c(sex = "All sexes", employed = "All workers")). Defaults to "All".
include_na: Logical. If TRUE, includes missing values in the cross table.
recode_na: Character or NULL. Value used to replace missing values in labelled vectors; "auto" will determine a code automatically.
group_separator: Character. Separator used when concatenating group values in list output (if group_as_list = TRUE with a single group).
group_as_list: Logical. If TRUE, output is a named list of cross-tabulation tables keyed by group value. With a single group the list is flat; with 2+ groups the list is nested. When combined with group_as_hierarchy = TRUE, a nested list with totals at each level is returned.
group_as_hierarchy: Logical. When TRUE (without group_as_list), inserts grand-total rows into the output. When TRUE together with group_as_list = TRUE, returns a nested named list with a total entry at each level; the total key is formatted as "{var_label}: {label_group_hierarchy}".
calculate_per_group: Logical. If TRUE, calculates the cross-tabulation separately for each group defined by the grouping variable(s).
expand_categories: Logical. If TRUE, ensures that all categories of x are represented in the output, even if they have zero counts.
position_total: Character. Position of the total row/column; either "bottom" or "top" for rows, and "right" or "left" for columns.
sort_column_names: Logical. If TRUE, sorts the column names in the output.
collapse_list: Logical (NOT YET IMPLEMENTED). If TRUE and group_as_list = TRUE, collapses the list of frequency tables into a single data frame with group identifiers. See also collapse_list().
convert_factor: Logical. If TRUE, converts labelled variables to factors in the output. See also convert_factor().
multiple_columns: Logical or NULL. If TRUE, each column in ... is treated as a binary indicator variable. Rows where the column equals multiple_columns_filter are counted per x category and presented as side-by-side frequency/percent columns in a single wide table. Requires at least 2 columns in ...; if fewer are supplied a warning is issued and the function falls back to regular cross-tabulation mode.
multiple_columns_type: Character. Controls how multiple_columns = TRUE handles the additional columns. "filtered" (default) treats each column as a binary indicator and produces a wide table with one column-pair per variable. "stacked" stacks results hierarchically: each column in ... becomes a row group with x categories as columns; multiple_columns_filter is ignored in this mode.
multiple_columns_filter: Scalar value (default 1L). The value to filter on when multiple_columns = TRUE and multiple_columns_type = "filtered". Ignored when multiple_columns_type = "stacked".
metadata: A named list with optional metadata to attach as attributes, e.g. title, subtitle, and source_note.
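
The arithmetic behind add_total, add_percent, as_proportion, and percent_by_column can be sketched with base R's table(), prop.table(), and addmargins(). This is only an illustration of the row-versus-column distinction under assumed toy data; it is not how generate_crosstab is implemented, and the data frame df below is hypothetical rather than the package's person_record.

```r
# Hypothetical toy data (not the package's person_record dataset).
df <- data.frame(
  marital_status = c("Single", "Married", "Married",
                     "Single", "Widowed", "Married"),
  sex = c("Male", "Female", "Male", "Female", "Female", "Male")
)

# Raw counts: rows are marital_status, columns are sex.
counts <- table(df$marital_status, df$sex)

# Row percentages: each row sums to 100
# (the idea behind percent_by_column = FALSE).
row_pct <- prop.table(counts, margin = 1) * 100

# Column percentages: each column sums to 100
# (the idea behind percent_by_column = TRUE).
col_pct <- prop.table(counts, margin = 2) * 100

# Proportions on the 0-1 scale instead of percentages
# (the idea behind as_proportion = TRUE).
row_prop <- prop.table(counts, margin = 1)

# Counts with a total row and a total column appended;
# base R labels both margins "Sum".
with_totals <- addmargins(counts)

print(with_totals)
print(round(col_pct, 1))
```

Row percentages answer "within each category of x, how are the column categories distributed?", while column percentages answer the reverse question, so the choice depends on which variable is treated as the base.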

Examples

# Using built-in dataset `person_record`

# Basic usage
person_record |>
  generate_crosstab(marital_status, sex)

# Multiple variables
person_record |>
  generate_crosstab(
    sex,
    seeing,
    hearing,
    walking,
    remembering,
    self_caring,
    communicating
  )

# Grouping
person_record |>
  dplyr::group_by(sex) |>
  generate_crosstab(marital_status, employed, group_as_list = TRUE)

# Nested list with totals at each level (group_as_list + group_as_hierarchy)
person_record |>
  dplyr::group_by(sex) |>
  generate_crosstab(
    marital_status,
    employed,
    group_as_list = TRUE,
    group_as_hierarchy = TRUE
  )

# Percentages calculated by column instead of by row
person_record |>
  generate_crosstab(
    marital_status,
    sex,
    percent_by_column = TRUE
  )
