Generate cross-tabulation
generate_crosstab(
  data,
  x,
  ...,
  add_total = TRUE,
  add_total_row = TRUE,
  add_total_column = TRUE,
  add_percent = TRUE,
  as_proportion = FALSE,
  percent_by_column = FALSE,
  name_separator = "_",
  label_separator = "__",
  label_total = "Total",
  label_total_column = NULL,
  label_total_row = NULL,
  label_na = "Not reported",
  label_as_group_name = TRUE,
  label_group_hierarchy = "All",
  include_na = TRUE,
  recode_na = "auto",
  group_separator = " - ",
  group_as_list = FALSE,
  group_as_hierarchy = FALSE,
  calculate_per_group = TRUE,
  expand_categories = TRUE,
  position_total = "bottom",
  sort_column_names = TRUE,
  collapse_list = FALSE,
  convert_factor = FALSE,
  multiple_columns = FALSE,
  multiple_columns_type = c("filtered", "stacked"),
  multiple_columns_filter = 1L,
  metadata = NULL
)
A data frame or a list of data frames containing the cross-tabulation results. If group_as_list = TRUE, the output is a list of data frames, one for each combination of the grouping variable(s); otherwise, a single data frame is returned. Each data frame includes counts and, if requested, percentages or proportions for each combination of x and the variables passed through the ... argument.
data: A data frame (typically a tibble) containing the variables to summarize.
x: The variable to use for the rows of the cross-tabulation.
...: Additional variable(s) to use for the columns of the cross-tabulation. If none are provided, a frequency table for x is returned.
add_total: Logical. If TRUE, adds a total row and/or column.
add_total_row: Logical. If TRUE, adds a total row.
add_total_column: Logical. If TRUE, adds a total column.
add_percent: Logical. If TRUE, adds percent or proportion values to the table.
as_proportion: Logical. If TRUE, reports proportions (range 0–1) instead of percentages.
percent_by_column: Logical. If TRUE, percentages are calculated by column; otherwise, by row.
name_separator: Character. Separator used when constructing variable names in the output.
label_separator: Character. Separator used when constructing labels in the output.
label_total: Character. Label used for the total row/category.
label_total_column: Character. Label used for the total column/category.
label_total_row: Character. Label used for the total row/category.
label_na: Character. Label to use for missing (NA) values.
label_as_group_name: Logical. If TRUE, uses the variable label of the grouping variable(s) as the name in the output list.
label_group_hierarchy: Character. Label applied to grand-total entries when group_as_hierarchy = TRUE. Can be a single string (applied to all group levels) or a named character vector keyed by group column name for per-variable labels (e.g. c(sex = "All sexes", employed = "All workers")). Defaults to "All".
include_na: Logical. If TRUE, includes missing values in the cross-tabulation.
recode_na: Character or NULL. Value used to replace missing values in labelled vectors; "auto" determines a code automatically.
group_separator: Character. Separator used when concatenating group values in list output (if group_as_list = TRUE with a single group).
group_as_list: Logical. If TRUE, the output is a named list of cross-tabulation tables keyed by group value. With a single group the list is flat; with two or more groups the list is nested. When combined with group_as_hierarchy = TRUE, a nested list with totals at each level is returned.
group_as_hierarchy: Logical. When TRUE (without group_as_list), inserts grand-total rows into the output. When TRUE together with group_as_list = TRUE, returns a nested named list with a total entry at each level; the total key is formatted as "{var_label}: {label_group_hierarchy}".
calculate_per_group: Logical. If TRUE, calculates the cross-tabulation separately for each group defined by the grouping variable(s).
expand_categories: Logical. If TRUE, ensures that all categories of x are represented in the output, even if they have zero counts.
position_total: Character. Position of the total row/column; either "bottom" or "top" for rows, and "right" or "left" for columns.
sort_column_names: Logical. If TRUE, sorts the column names in the output.
collapse_list: Logical (NOT YET IMPLEMENTED). If TRUE and group_as_list = TRUE, collapses the list of frequency tables into a single data frame with group identifiers. See also collapse_list().
convert_factor: Logical. If TRUE, converts labelled variables to factors in the output. See also convert_factor().
multiple_columns: Logical or NULL. If TRUE, each column in ... is treated as a binary indicator variable. Rows where the column equals multiple_columns_filter are counted per x category and presented as side-by-side frequency/percent columns in a single wide table. Requires at least 2 columns in ...; if fewer are supplied, a warning is issued and the function falls back to regular cross-tabulation mode.
multiple_columns_type: Character. Controls how multiple_columns = TRUE handles the additional columns. "filtered" (default) treats each column as a binary indicator and produces a wide table with one column pair per variable. "stacked" stacks results hierarchically: each column in ... becomes a row group with x categories as columns; multiple_columns_filter is ignored in this mode.
multiple_columns_filter: Scalar value (default 1L). The value to filter on when multiple_columns = TRUE and multiple_columns_type = "filtered". Ignored when multiple_columns_type = "stacked".
metadata: A named list with optional metadata to attach as attributes, e.g. title, subtitle, and source_note.
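To make the difference between row and column percentages concrete, here is a minimal base R sketch of the arithmetic that percent_by_column and as_proportion describe. It uses only table() and prop.table() on a made-up toy data frame (the toy object and its values are assumptions for illustration); it is not the implementation of generate_crosstab(), only an analogy for what the options mean.

```r
# Toy data (hypothetical values; column names mirror the examples on this page)
toy <- data.frame(
  marital_status = c("Single", "Single", "Married", "Married", "Married", "Widowed"),
  sex            = c("Male", "Female", "Male", "Female", "Female", "Female")
)

# Raw counts: marital_status in rows, sex in columns
counts <- table(toy$marital_status, toy$sex)

# Row percentages: each row sums to 100
# (analogous to percent_by_column = FALSE)
row_percent <- prop.table(counts, margin = 1) * 100

# Column percentages: each column sums to 100
# (analogous to percent_by_column = TRUE)
col_percent <- prop.table(counts, margin = 2) * 100

# Column proportions on the 0-1 scale
# (analogous to as_proportion = TRUE with percent_by_column = TRUE)
col_prop <- prop.table(counts, margin = 2)

print(round(row_percent, 1))
print(round(col_percent, 1))
```

The only thing that changes between the two modes is the margin used as the denominator: row totals for row percentages, column totals for column percentages.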
See also: generate_frequency(), generate_output(), rename_label(), remove_label()
# Using built-in dataset `person_record`
# Basic usage
person_record |>
  generate_crosstab(marital_status, sex)
# Multiple variables
person_record |>
  generate_crosstab(
    sex,
    seeing,
    hearing,
    walking,
    remembering,
    self_caring,
    communicating
  )
# Grouping
person_record |>
  dplyr::group_by(sex) |>
  generate_crosstab(marital_status, employed, group_as_list = TRUE)
# Nested list with totals at each level (group_as_list + group_as_hierarchy)
person_record |>
  dplyr::group_by(sex) |>
  generate_crosstab(
    marital_status,
    employed,
    group_as_list = TRUE,
    group_as_hierarchy = TRUE
  )
# Percent or proportion by row or column
person_record |>
  generate_crosstab(
    marital_status,
    sex,
    percent_by_column = TRUE
  )
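The last example switches the percentage direction but does not show proportions. As a complementary sketch, the call below combines the documented as_proportion and percent_by_column arguments. It assumes the package that provides generate_crosstab() and the person_record dataset is already attached; the object name crosstab_prop is only illustrative.

```r
# Assumption: the package providing generate_crosstab() and person_record
# is already attached in this session.
# Column-wise proportions (0-1 scale) instead of percentages.
crosstab_prop <- person_record |>
  generate_crosstab(
    marital_status,
    sex,
    as_proportion = TRUE,
    percent_by_column = TRUE
  )

crosstab_prop
```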