cross_tab: Cross-tabulation

Description

Computes a two-way cross-tabulation with optional weights, grouping (including combinations of multiple variables), percentage displays, and inferential statistics.

cross_tab() produces weighted or unweighted contingency tables with row or column percentages, optional grouping via by, and associated Chi-squared tests with an association measure and diagnostic information.

Both x and y variables are required. For one-way frequency tables, use freq() instead.
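For orientation, the quantities described above can be approximated in base R. This is only a rough sketch, not the spicy implementation: it uses the built-in mtcars data (vs and am) as a stand-in for real survey variables, and it assumes the usual textbook definition of Cramer's V.

```r
# Base-R sketch only; cross_tab() may compute and format these differently.
tab <- table(vs = mtcars$vs, am = mtcars$am)   # two-way table of counts

# Row and column percentages (margin 1 = rows, margin 2 = columns)
row_pct <- 100 * prop.table(tab, margin = 1)
col_pct <- 100 * prop.table(tab, margin = 2)

# Pearson Chi-squared test without continuity correction
test <- chisq.test(tab, correct = FALSE)

# Cramer's V = sqrt(chi2 / (n * (min(rows, cols) - 1)))
n <- sum(tab)
cramer_v <- sqrt(unname(test$statistic) / (n * (min(dim(tab)) - 1)))

round(row_pct, 1)
test$p.value
cramer_v
```

Each row of row_pct sums to 100 and each column of col_pct sums to 100, which is a quick way to check which percentage mode a table is showing.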

Usage

cross_tab(
  data,
  x,
  y = NULL,
  by = NULL,
  weights = NULL,
  rescale = FALSE,
  percent = c("none", "column", "row"),
  include_stats = TRUE,
  assoc_measure = c("auto", "cramer_v", "phi", "gamma", "tau_b", "tau_c", "somers_d",
    "lambda", "none"),
  assoc_ci = FALSE,
  correct = FALSE,
  simulate_p = FALSE,
  simulate_B = 2000,
  digits = NULL,
  styled = TRUE,
  show_n = TRUE
)
# S3 method for spicy_cross_table_list
print(x, ...)

Value

A spicy_cross_table object (styled = TRUE, the default), a plain data.frame (styled = FALSE), or a list of data.frames. When by is used, returns a spicy_cross_table_list.

Arguments

data: A data frame. Alternatively, a vector when using the vector-based interface.
x: Row variable (unquoted).
y: Column variable (unquoted). Mandatory; for one-way tables, use freq().
by: Optional grouping variable or expression. Can be a single variable or a combination of multiple variables (e.g. interaction(vs, am)).
weights: Optional numeric weights.
rescale: Logical. If FALSE (the default), weights are used as-is. If TRUE, rescales the weights so that the total weighted N matches the raw (unweighted) N.
percent: One of "none" (the default), "row", "column". Unique abbreviations are accepted (e.g. "n", "r", "c").
include_stats: Logical. If TRUE (the default), computes Chi-squared and an association measure (see assoc_measure).
assoc_measure: Character. Which association measure to report. "auto" (default) selects Kendall's Tau-b when both variables are ordered factors and Cramer's V otherwise. Other choices: "cramer_v", "phi", "gamma", "tau_b", "tau_c", "somers_d", "lambda", "none".
assoc_ci: Logical. If TRUE, includes the 95% confidence interval of the association measure in the note. Defaults to FALSE.
correct: Logical. If FALSE (the default), no continuity correction is applied. If TRUE, applies Yates' continuity correction (2x2 tables only).
simulate_p: Logical. If FALSE (the default), uses asymptotic p-values. If TRUE, uses Monte Carlo simulation.
simulate_B: Integer. Number of replicates for Monte Carlo simulation. Defaults to 2000.
digits: Number of decimals. Defaults to 1 for percentages, 0 for counts.
styled: Logical. If TRUE (the default), returns a spicy_cross_table object (for formatted printing). If FALSE, returns a plain data.frame.
show_n: Logical. If TRUE (the default), adds marginal N totals when percent != "none".
...: Additional arguments passed to individual print methods.
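The effect of rescale = TRUE can be illustrated with a small base-R sketch. The weights and the data frame d below are made up for illustration, and the formula shown (multiplying by n / sum(w)) is an assumption about what "total weighted N matches raw N" means; the package's handling of missing weights is not covered here.

```r
# Hypothetical weights; not taken from any real data set.
w <- c(0.5, 2.0, 1.5, 1.0, 3.0)

# Rescale so that the weights sum to the number of observations.
w_rescaled <- w * length(w) / sum(w)

# Weighted two-way table built with xtabs(): cells hold sums of weights.
d <- data.frame(
  x = c("a", "a", "b", "b", "b"),
  y = c("u", "v", "u", "v", "v"),
  w = w_rescaled
)
wtab <- xtabs(w ~ x + y, data = d)

sum(w)     # total of the original weights
sum(wtab)  # total weighted N after rescaling, equal to nrow(d)
```

Rescaling leaves the cell proportions unchanged; it only changes the total N that enters the Chi-squared statistic.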

Global Options

The function recognizes the following global options that modify its default behavior:

options(spicy.percent = "column"): Sets the default percentage mode for all calls to cross_tab(). Valid values are "none", "row", and "column". Equivalent to setting percent = "column" (or another choice) in each call.
options(spicy.simulate_p = TRUE): Enables Monte Carlo simulation for all Chi-squared tests by default. Equivalent to setting simulate_p = TRUE in every call.
options(spicy.rescale = TRUE): Automatically rescales weights so that total weighted N equals the raw N. Equivalent to setting rescale = TRUE in each call.

These options are convenient for enforcing consistent behavior across multiple calls to cross_tab() and other spicy table functions. To reset them, set them to NULL: options(spicy.percent = NULL, spicy.simulate_p = NULL, spicy.rescale = NULL).

Example:

options(spicy.simulate_p = TRUE, spicy.rescale = TRUE)
cross_tab(sochealth, smoking, education, weights = weight)
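When the options should apply only to part of a script, one common base-R pattern is to save the previous values and restore them afterwards. The sketch below relies only on base R's options() and getOption(); the option names are the ones listed above.

```r
# options() returns the previous values of the options it changes.
old <- options(spicy.percent = "column", spicy.simulate_p = TRUE)

getOption("spicy.percent")     # "column" while the setting is active
getOption("spicy.simulate_p")  # TRUE while the setting is active

# ... calls to cross_tab() would go here ...

# Restore the earlier state (options that were unset become unset again).
options(old)
```

Inside a function, wrapping the restore step in on.exit(options(old)) keeps the change from leaking out if an error occurs.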

Examples

# Basic crosstab
cross_tab(sochealth, smoking, education)

# Column percentages
cross_tab(sochealth, smoking, education, percent = "column")

# Weighted (rescaled)
cross_tab(sochealth, smoking, education, weights = weight, rescale = TRUE)

# Grouped by sex
cross_tab(sochealth, smoking, education, by = sex)

# Grouped by combination of variables
cross_tab(sochealth, smoking, education, by = interaction(sex, age_group))

# Ordinal variables: auto-selects Kendall's Tau-b
cross_tab(sochealth, education, self_rated_health)

# 2x2 table with Yates correction
cross_tab(sochealth, smoking, physical_activity, correct = TRUE)
