
sumExtras (version 0.3.0)

add_auto_labels: Add automatic labels from dictionary to a gtsummary table

Description

Automatically applies variable labels from a dictionary or from label attributes to tbl_summary, tbl_svysummary, or tbl_regression objects. Manual label overrides set in the original table call are preserved. The dictionary can be passed explicitly; if it is omitted, the function searches the calling environment for one. If no dictionary is found, the function falls back to reading label attributes from the underlying data.

Usage

add_auto_labels(tbl, dictionary)

Value

A gtsummary table object with labels applied. Manual labels set via label = list(...) in the original table call are always preserved.

Arguments

tbl

A gtsummary table object created by tbl_summary(), tbl_svysummary(), or tbl_regression().

dictionary

A data frame or tibble with Variable and Description columns. If not provided (missing), the function will search for a dictionary object in the calling environment. If no dictionary is found, the function will attempt to read label attributes from the data. Set to NULL explicitly to skip dictionary search and only use attributes.

Options

Set options(sumExtras.preferDictionary = TRUE) to prioritize dictionary labels over label attributes when both are available. Default is FALSE, which prioritizes attributes over dictionary labels.
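Because options() changes global state for the whole session, it can help to capture the previous value and restore it afterwards. The snippet below is a minimal base R sketch of that pattern; only the option name comes from this page, and the variable names are illustrative.

```r
# options() invisibly returns the previous values of the options it changes
old_opts <- options(sumExtras.preferDictionary = TRUE)

# Read the option back, falling back to the documented default of FALSE
prefer_dict <- getOption("sumExtras.preferDictionary", default = FALSE)

# ... build tables with add_auto_labels() here ...

# Restore whatever was set before (unset options are restored to unset)
options(old_opts)
```

Restoring the option keeps later tables in the same session from silently picking up the non-default priority.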

Details

Label Priority Hierarchy

The function applies labels according to this priority (highest to lowest):

  1. Manual labels - Labels set via label = list(...) in tbl_summary() etc. are always preserved

  2. Dictionary vs Attributes - Controlled by options(sumExtras.preferDictionary):

    • If TRUE: Dictionary labels take precedence over attribute labels

    • If FALSE (default): Attribute labels take precedence over dictionary labels

  3. Default - If no label source is available, uses variable name
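The priority order above can be sketched as a small base R function. This is an illustration of the documented rules only, not the package's actual implementation: resolve_label() and its arguments are hypothetical names introduced for this example.

```r
# Hypothetical helper (not part of sumExtras): picks one variable's label
# following the documented priority order.
resolve_label <- function(var, manual = list(), attribute = NULL,
                          dictionary = NULL,
                          prefer_dictionary = getOption("sumExtras.preferDictionary", FALSE)) {
  # 1. A manual label always wins
  if (!is.null(manual[[var]])) return(manual[[var]])

  # Look up the dictionary label, if a dictionary was supplied
  dict_label <- NULL
  if (!is.null(dictionary)) {
    hit <- dictionary$Description[dictionary$Variable == var]
    if (length(hit) > 0) dict_label <- hit[[1]]
  }

  # 2. Dictionary vs attribute, ordered by the option
  candidates <- if (isTRUE(prefer_dictionary)) {
    list(dict_label, attribute)
  } else {
    list(attribute, dict_label)
  }
  for (label in candidates) {
    if (!is.null(label)) return(label)
  }

  # 3. Fall back to the variable name
  var
}

dict <- data.frame(
  Variable = "age",
  Description = "Age from Dictionary",
  stringsAsFactors = FALSE
)

# Attribute outranks dictionary by default
resolve_label("age", attribute = "Age from Attribute", dictionary = dict)
# Dictionary outranks attribute when preferred
resolve_label("age", attribute = "Age from Attribute", dictionary = dict,
              prefer_dictionary = TRUE)
# Manual label beats both
resolve_label("age", manual = list(age = "Custom Age Label"), dictionary = dict)
# No source available: the variable name is used
resolve_label("grade", dictionary = dict)
```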

Dictionary Format

The dictionary must be a data frame with columns:

  • Variable: Character column with exact variable names from datasets

  • Description: Character column with human-readable labels
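As a sketch of the format described above, a dictionary can be built with base R alone and checked before it is passed to add_auto_labels(). The check_dictionary() helper is hypothetical and not part of sumExtras; it only encodes the two column requirements listed here.

```r
# A dictionary built with base R; tibble::tribble() works equally well
my_dict <- data.frame(
  Variable = c("age", "trt", "grade"),
  Description = c("Age at Enrollment", "Treatment Group", "Tumor Grade"),
  stringsAsFactors = FALSE
)

# Hypothetical pre-flight check (not a sumExtras function)
check_dictionary <- function(dict) {
  required <- c("Variable", "Description")
  missing_cols <- setdiff(required, names(dict))
  if (length(missing_cols) > 0) {
    stop("Dictionary is missing column(s): ",
         paste(missing_cols, collapse = ", "))
  }
  if (!is.character(dict$Variable) || !is.character(dict$Description)) {
    stop("Variable and Description must be character columns")
  }
  invisible(TRUE)
}

check_dictionary(my_dict)
```

Since Variable must match variable names exactly, a mismatch in case or spelling (for example "Age" instead of "age") would leave that variable without a dictionary label.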

Label Attributes

The function reads label attributes from data using attr(data$var, "label"), following the same label convention used by haven, Hmisc, and ggplot2 4.0+.
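The attribute convention can be tried out in base R without any packages. The following minimal sketch uses a made-up data frame to show how a "label" attribute is set on a column and read back, and how unlabeled columns simply return NULL.

```r
# Toy data for illustration only
df <- data.frame(age = c(34, 51, 47), marker = c(0.2, 1.4, 0.9))

# Attach a label to one column
attr(df$age, "label") <- "Patient Age (years)"

# Read labels back; an unlabeled column has no "label" attribute
age_label <- attr(df$age, "label")
marker_label <- attr(df$marker, "label")  # NULL

# Collect labels for every column, using NA where none is set
all_labels <- vapply(df, function(col) {
  lbl <- attr(col, "label", exact = TRUE)
  if (is.null(lbl)) NA_character_ else as.character(lbl)
}, character(1))
```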

Your data may already have labels from various sources - imported from statistical software packages, set by other R packages, added manually, or from collaborative projects. This function discovers and applies them seamlessly within gtsummary tables.

Because sumExtras uses R's native attribute storage, labels work across any package that respects the "label" attribute convention, including:

  • ggplot2 4.0+ - automatic axis and legend labels

  • gt - table label support

  • Hmisc - label utilities and display functions

This approach adds no package dependencies. It is fully compatible with the labelled package but does not require it.

Implementation Note

This function relies on internal gtsummary structures (tbl$call_list, tbl$inputs, tbl$table_body) to detect manually set labels. While robust error handling is implemented, major updates to gtsummary may require corresponding updates to sumExtras. Requires gtsummary >= 1.7.0.

See Also

  • apply_labels_from_dictionary() for setting label attributes on data for ggplot2/other packages

  • gtsummary::modify_table_body() for advanced table customization

Other labeling functions: apply_labels_from_dictionary()

Examples

# Create a dictionary
my_dict <- tibble::tribble(
  ~Variable, ~Description,
  "age", "Age at Enrollment",
  "trt", "Treatment Group",
  "grade", "Tumor Grade"
)

# Basic usage: pass dictionary explicitly
gtsummary::trial |>
  gtsummary::tbl_summary(by = trt, include = c(age, grade)) |>
  add_auto_labels(dictionary = my_dict)

# Automatic dictionary search (dictionary in environment)
dictionary <- my_dict
gtsummary::trial |>
  gtsummary::tbl_summary(by = trt, include = c(age, grade)) |>
  add_auto_labels()  # Finds dictionary automatically

# Working with pre-labeled data (no dictionary needed)
labeled_data <- gtsummary::trial
attr(labeled_data$age, "label") <- "Patient Age (years)"
attr(labeled_data$marker, "label") <- "Marker Level (ng/mL)"

labeled_data |>
  gtsummary::tbl_summary(include = c(age, marker)) |>
  add_auto_labels()  # Reads from label attributes (they outrank a dictionary by default)

# Manual overrides always win
gtsummary::trial |>
  gtsummary::tbl_summary(
    by = trt,
    include = c(age, grade),
    label = list(age ~ "Custom Age Label")  # Manual override
  ) |>
  add_auto_labels(dictionary = my_dict)  # grade gets dict label, age keeps manual

# Control priority with options (save the old value so it can be restored)
old_opts <- options(sumExtras.preferDictionary = TRUE)  # Dictionary over attributes

# Data has both dictionary and attributes
labeled_trial <- gtsummary::trial
attr(labeled_trial$age, "label") <- "Age from Attribute"
dictionary <- tibble::tribble(
  ~Variable, ~Description,
  "age", "Age from Dictionary"
)

labeled_trial |>
  gtsummary::tbl_summary(include = age) |>
  add_auto_labels()  # Uses "Age from Dictionary" (option = TRUE)

options(old_opts)  # Restore the previous setting
