SOCbyPT_Grade: SOC → PT summary by treatment with Grade split (wide)

Description

Summarises AEs by System Organ Class (SOC) → Preferred Term (PT) per treatment arm and splits each arm into Grade buckets (1–5 + NOT REPORTED). The table starts with a TOTAL SUBJECTS WITH AN EVENT row, followed by SOC blocks with optional SOC subtotal rows and RTF-safe indenting for PT lines. The order of the SOC/PT blocks can be driven by a reference arm (e.g., TRTAN = 12) and a specific grade via sort_grade (default 5).

Usage

SOCbyPT_Grade(
  indata,
  dmdata,
  pop_data = NULL,
  group_vars,
  trtan_coln,
  grade_num = "AETOXGRN",
  grade_char = NULL,
  by_var = NULL,
  by_sort_var = NULL,
  by_sort_numeric = TRUE,
  bigN_by = NULL,
  print_bigN = FALSE,
  id_var = "USUBJID",
  rtf_safe = TRUE,
  indent_str = "(*ESC*)R/RTF\"\\li360 \"",
  use_sas_round = FALSE,
  header_blank = TRUE,
  soc_totals = FALSE,
  total_label = "TOTAL SUBJECTS WITH AN EVENT",
  nr_char_values = c("NOT REPORTED", "NOT_REPORTED", "NOTREPORTED", "NOT REPRTED", "NR",
    "N", "NA"),
  sort_grade = 5,
  debug = FALSE,
  uncoded_position = c("count", "last")
)

Value

A tibble with columns:

stat: the row label (total row, SOC, or PT).
TRT<trt>_GRADE1, …, TRT<trt>_GRADE5, TRT<trt>_NOT_REPORTED: one set of grade-bucket columns per treatment.
sort_ord, sec_ord: ordering columns.
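
As a rough illustration of the wide layout, the grade column names for a set of treatment codes can be generated as below. The helper name grade_colnames is hypothetical and not part of the package; it only follows the naming pattern listed above.

```r
# Hypothetical helper (not part of the package): build the grade column
# names per arm, following TRT<trt>_GRADE1..5 and TRT<trt>_NOT_REPORTED.
grade_colnames <- function(trt_codes) {
  buckets <- c(paste0("GRADE", 1:5), "NOT_REPORTED")
  # outer() pairs every bucket with every treatment code; as.vector()
  # flattens column by column, so each arm's six columns stay together.
  as.vector(outer(buckets, trt_codes, function(b, t) paste0("TRT", t, "_", b)))
}

cols <- grade_colnames(c(11, 12))
print(cols)
```

For two arms this yields twelve grade columns, six per arm.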

Arguments

indata

data.frame. AE-like data containing USUBJID, treatment, SOC, PT, and Grade variables.

dmdata

data.frame. ADSL-like data containing denominators per arm (must include USUBJID and the same treatment column as in indata).

pop_data

data.frame or NULL. Optional master population for arm Ns (defaults to dmdata).

group_vars

Character vector of length 3: c(main_trt, soc, pt). Example: c("TRTAN","AEBODSYS","AEDECOD").

trtan_coln

Character or numeric. The reference treatment code used for ordering SOC/PT blocks (e.g., "12").

grade_num

Character. Name of numeric grade column (default "AETOXGRN"). Values 1–5 are treated as valid grades; others are ignored in numeric logic.

grade_char

Character or NULL. Optional character grade column name (e.g., "AETOCGR"/"AETOXGR"). If NULL, the function auto-detects "AETOCGR" then "AETOXGR" if present.

by_var

Character or NULL. Optional BY variable (from AE dataset) to generate stratified outputs and sort independently per stratum.

by_sort_var

Character or NULL. Optional helper column to order BY strata; defaults to by_var when NULL.

by_sort_numeric

Logical. If TRUE (default), order BY strata by as.numeric(by_sort_var), else use character order.
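
The difference between numeric and character ordering matters when strata are coded as digit strings. The toy vector below is an assumption for illustration and does not come from the package.

```r
# Toy BY strata stored as character codes (illustrative values only).
strata <- c("10", "2", "1")

# Character order compares strings left to right, so "10" sorts before "2".
# method = "radix" keeps the result independent of the session locale.
char_order <- sort(strata, method = "radix")

# Numeric order, as described for by_sort_numeric = TRUE.
num_order <- strata[order(as.numeric(strata))]

print(char_order)
print(num_order)
```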

bigN_by

Character or NULL. Controls which denominators are used when by_var is supplied:

NULL / "NO" (default): denominators are by treatment only (not stratified by BY)
"YES": denominators are by BY × treatment (requires by_var in dmdata or pop_data)

print_bigN

Logical. If TRUE, prints the denominators (Big-N) used for percent calculations to the console/log (default FALSE).
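
To make the Big-N denominators concrete, the sketch below counts subjects per treatment and per BY × treatment from a toy ADSL-like data frame. It is a base R illustration of the two bigN_by settings under stated assumptions, not the function's internal code.

```r
# Toy ADSL-like data (same shape as adsl_sex in the Examples section).
adsl_toy <- data.frame(
  USUBJID = c("01", "02", "03", "04"),
  TRTAN   = c(11, 11, 12, 12),
  SEX     = c("M", "F", "M", "F"),
  stringsAsFactors = FALSE
)

n_unique <- function(x) length(unique(x))

# bigN_by = "NO": one denominator per treatment arm.
n_trt <- tapply(adsl_toy$USUBJID, adsl_toy$TRTAN, n_unique)

# bigN_by = "YES": one denominator per BY stratum x treatment arm.
n_by_trt <- tapply(adsl_toy$USUBJID, list(adsl_toy$SEX, adsl_toy$TRTAN), n_unique)

# One female subject in arm 12 with an event, under each denominator.
pct_no  <- 100 * 1 / n_trt[["12"]]
pct_yes <- 100 * 1 / n_by_trt["F", "12"]

print(n_trt)
print(n_by_trt)
```

The same single event gives 50% against the arm-level N and 100% against the SEX × arm N, so it is worth checking the printed Big-N when BY strata are in use.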

id_var

Character. Subject ID column (default "USUBJID").

rtf_safe

Logical. If TRUE (default), prefix PT rows with indent_str.

indent_str

Character. The RTF literal for indentation of PT lines (default "(*ESC*)R/RTF\"\\li360 \"").
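
A minimal sketch of what prefixing PT rows with indent_str looks like. The labels and the is_pt flag are made up for illustration; how the function identifies PT rows internally is not documented here.

```r
# Default indent literal, copied from the Usage section.
indent_str <- "(*ESC*)R/RTF\"\\li360 \""

# Toy row labels: one SOC header followed by two PT rows (assumed layout).
labels <- c("GASTROINTESTINAL", "NAUSEA", "VOMITING")
is_pt  <- c(FALSE, TRUE, TRUE)

# Prefix only the PT rows, leaving SOC rows untouched.
stat <- ifelse(is_pt, paste0(indent_str, labels), labels)
cat(stat, sep = "\n")
```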

use_sas_round

Logical. If TRUE, use SAS-style rounding for percentages; else base R round().
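
For readers unfamiliar with the difference, the sketch below contrasts base R round(), which rounds exact halves to the even digit, with a "round half away from zero" rule. The function sas_round is a hypothetical stand-in written for this illustration; the package's own rounding routine may differ, in particular in how it handles floating-point fuzz.

```r
# Hypothetical helper: round half away from zero.
# The small fuzz term is an assumption made here to absorb
# floating-point representation error just below a half.
sas_round <- function(x, digits = 0) {
  scale <- 10^digits
  fuzz  <- sqrt(.Machine$double.eps)
  sign(x) * floor(abs(x) * scale + 0.5 + fuzz) / scale
}

x <- c(0.5, 2.5, -2.5)
base_result <- round(x)      # halves go to the even digit: 0, 2, -2
sas_result  <- sas_round(x)  # halves go away from zero:    1, 3, -3

print(base_result)
print(sas_result)
```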

header_blank

Logical. If TRUE (default) and soc_totals = FALSE, grade columns on SOC header rows are blanked.

soc_totals

Logical. If TRUE, include SOC subtotal rows using the same grade logic as PT rows.

total_label

Character. Label for the top row (default "TOTAL SUBJECTS WITH AN EVENT").

nr_char_values

Character vector. Values in grade_char that are considered "Not Reported". Default includes multiple NR encodings.
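
The following sketch shows one way a character grade could be matched against nr_char_values. Trimming whitespace and upper-casing before matching are assumptions made for this illustration; the exact cleaning steps used by the function are not spelled out in this page.

```r
# Default NR encodings, copied from the Usage section.
nr_char_values <- c("NOT REPORTED", "NOT_REPORTED", "NOTREPORTED",
                    "NOT REPRTED", "NR", "N", "NA")

# Hypothetical helper: TRUE where a character grade counts as Not Reported.
# Missing values (real NA) are treated as "no information", not as NR.
is_not_reported <- function(x, nr = nr_char_values) {
  !is.na(x) & toupper(trimws(x)) %in% nr
}

res <- is_not_reported(c("nr", " Not Reported ", "3", "", NA))
print(res)
```

Note that the string "NA" is in the default list while a true missing value is handled separately in this sketch.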

sort_grade

Integer or character. Grade used for ordering within the reference arm (default 5). Use "NOT REPORTED" (or any synonym in nr_char_values) to sort by NR instead.
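
The ordering idea can be sketched as follows. Several details are assumptions made for this example and are not confirmed by this page: that PTs are ordered by descending subject count, that the fallback to all grades applies when no subject in the reference arm has the requested grade, and that ties are broken alphabetically. The helper order_pts and the toy data are hypothetical.

```r
# Toy subject-level data: one row per subject x PT with its max grade.
pt_max <- data.frame(
  TRTAN   = c(12, 12, 12, 12, 11),
  AEDECOD = c("NAUSEA", "HEADACHE", "HEADACHE", "DIZZINESS", "NAUSEA"),
  GRADE   = c(5, 5, 5, 2, 5),
  stringsAsFactors = FALSE
)

# Hypothetical helper: order PTs by counts in the reference arm at sort_grade.
order_pts <- function(d, ref_arm, sort_grade) {
  ref      <- d[d$TRTAN == ref_arm, ]
  at_grade <- ref[ref$GRADE == sort_grade, ]
  # Assumed fallback: nobody at that grade, so count across all grades.
  src <- if (nrow(at_grade) > 0) at_grade else ref
  pts <- unique(d$AEDECOD)
  n   <- vapply(pts, function(p) sum(src$AEDECOD == p), integer(1))
  # Descending count, then alphabetical (both are assumptions).
  pts[order(-n, pts)]
}

by_grade5   <- order_pts(pt_max, ref_arm = 12, sort_grade = 5)
by_fallback <- order_pts(pt_max, ref_arm = 12, sort_grade = 4)

print(by_grade5)
print(by_fallback)
```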

debug

Logical. If TRUE, prints debug summaries.

uncoded_position

Character. One of c("count","last"). Controls the placement of the UNCODED block: "count" = position by counts (default); "last" = force SOC == "UNCODED" to the end (per BY stratum) and PT == "UNCODED" last within that SOC.
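
A small sketch of the "last" placement, assuming detection works by normalising non-breaking spaces, trimming, and upper-casing before comparing with "UNCODED". The helper is_uncoded is hypothetical and only mirrors the behaviour described in this page.

```r
# Hypothetical helper: case-insensitive match that tolerates surrounding
# spaces and non-breaking spaces (U+00A0).
is_uncoded <- function(x) {
  cleaned <- gsub("\u00a0", " ", x, fixed = TRUE)
  toupper(trimws(cleaned)) == "UNCODED"
}

soc <- c("Uncoded\u00a0", "NERVOUS SYSTEM", "GASTROINTESTINAL")

# order() on a logical puts FALSE before TRUE and keeps ties in their
# original order, so the UNCODED block moves to the end and the rest stay put.
soc_last <- soc[order(is_uncoded(soc))]
print(soc_last)
```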

Key features

Grades from numeric and/or character sources: Uses grade_num (1–5). If a character grade column exists (e.g., "AETOCGR"/"AETOXGR"), it is cleaned and mapped, with values in nr_char_values treated as Not Reported.
NR logic: (a) For PT rows, a subject contributes the max numeric grade among 1–5 (NR ignored). (b) For the top TOTAL row, if any PT for the subject is NR-only (no numeric grade), the subject contributes to NOT REPORTED; otherwise to their max numeric grade.
Ordering: Within SOC/PT, order is determined using counts from the reference arm trtan_coln filtered to sort_grade (fallback = all grades).
BY support: Optional by_var (from AE) adds strata with optional by_sort_var to control strata ordering (numeric or character).
SOC totals: soc_totals = TRUE adds a SOC subtotal row (max-grade logic).
Denominators: Ns are computed from dmdata (or pop_data, if provided).
Big N behavior with BY: controlled by bigN_by (TRT-only vs BY×TRT).
RTF-safe indent: PT stat values can be indented using indent_str.
SAS-style rounding: Percentages can follow SAS “round half away from zero” via use_sas_round = TRUE.
UNCODED placement: uncoded_position = c("count","last"). With "last", the block where SOC == "UNCODED" is forced to the very end (per BY stratum), and within that SOC the PT == "UNCODED" line is forced last. Detection is case-insensitive and robust to extra spaces/non-breaking spaces.
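
The NR logic above can be traced on a toy example. This is a base R sketch of rules (a) and (b) under stated assumptions, not the function's implementation: NA stands for a Not Reported grade, and an NR-only PT is kept as NA at the PT level.

```r
# Toy AE records; NA in GRADE stands for a Not Reported grade.
ae <- data.frame(
  USUBJID = c("01", "01", "02", "02"),
  AEDECOD = c("NAUSEA", "NAUSEA", "NAUSEA", "HEADACHE"),
  GRADE   = c(2, 3, NA, 4),
  stringsAsFactors = FALSE
)

# (a) PT rows: max numeric grade per subject x PT, ignoring NR.
#     A subject x PT with no numeric grade at all stays NA (NR-only).
max_or_na <- function(g) if (all(is.na(g))) NA_real_ else max(g, na.rm = TRUE)
key    <- paste(ae$USUBJID, ae$AEDECOD, sep = "|")
pt_max <- tapply(ae$GRADE, key, max_or_na)

# (b) TOTAL row: a subject with any NR-only PT goes to NOT REPORTED,
#     otherwise to the max numeric grade across their PTs.
subject <- sub("\\|.*$", "", names(pt_max))
total_bucket <- vapply(split(as.vector(pt_max), subject), function(g) {
  if (any(is.na(g))) "NOT REPORTED" else paste0("GRADE", max(g))
}, character(1))

print(pt_max)
print(total_bucket)
```

Subject 02 lands in NOT REPORTED on the TOTAL row despite a Grade 4 HEADACHE, because their NAUSEA record has no numeric grade.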

Examples



library(dplyr)

# Toy AE and ADSL data shared by Examples 1 to 3
adae <- tibble::tribble(
  ~USUBJID, ~TRTAN, ~AEBODSYS,           ~AEDECOD,          ~AETOXGRN,
  "01",       11,   "GASTROINTESTINAL",  "NAUSEA",          2,
  "01",       11,   "GASTROINTESTINAL",  "VOMITING",        3,
  "02",       11,   "GASTROINTESTINAL",  "NAUSEA",          5,
  "03",       12,   "NERVOUS SYSTEM",    "HEADACHE",        1,
  "03",       12,   "NERVOUS SYSTEM",    "DIZZINESS",       2,
  "04",       12,   "GASTROINTESTINAL",  "NAUSEA",          4
)

adsl <- tibble::tribble(
  ~USUBJID, ~TRTAN,
  "01",       11,
  "02",       11,
  "03",       12,
  "04",       12
)

# Example 1: default settings, with arm 12 as the reference for ordering
out1 <- SOCbyPT_Grade(
  indata     = adae,
  dmdata     = adsl,
  group_vars = c("TRTAN", "AEBODSYS", "AEDECOD"),
  trtan_coln = "12"   # reference arm for ordering
)

out1



# Example 2: add SOC subtotal rows
out2 <- SOCbyPT_Grade(
  indata       = adae,
  dmdata       = adsl,
  group_vars   = c("TRTAN", "AEBODSYS", "AEDECOD"),
  trtan_coln   = "12",
  soc_totals   = TRUE,
  header_blank = TRUE
)

out2


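# Example 3: character grades with Not Reported values, sorted by NR,
# with the UNCODED block forced to the end
adae2 <- tibble::tribble(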
  ~USUBJID, ~TRTAN, ~AEBODSYS,          ~AEDECOD,     ~AETOXGRN, ~AETOXGR,
  "01",       11,   "GASTROINTESTINAL", "NAUSEA",     2,         "",
  "02",       11,   "GASTROINTESTINAL", "NAUSEA",     NA,        "NR",
  "03",       12,   "NERVOUS SYSTEM",   "HEADACHE",   3,         NA,
  "04",       12,   "UNCODED",          "UNCODED",    NA,        "NOT REPORTED"
)

out3 <- SOCbyPT_Grade(
  indata       = adae2,
  dmdata       = adsl,
  group_vars   = c("TRTAN", "AEBODSYS", "AEDECOD"),
  trtan_coln   = "12",
  grade_num    = "AETOXGRN",
  grade_char   = "AETOXGR",
  sort_grade   = "NOT REPORTED",
  rtf_safe     = FALSE,
  uncoded_position = "last"
)

out3


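# Example 4: BY variable (SEX), comparing treatment-only denominators
# with BY x treatment denominators
adae_sex <- tibble::tribble(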
  ~USUBJID, ~TRTAN, ~SEX, ~AEBODSYS,          ~AEDECOD,    ~AETOXGRN,
  "01",       11,   "M",  "GASTROINTESTINAL", "NAUSEA",    2,
  "02",       11,   "F",  "GASTROINTESTINAL", "NAUSEA",    5,
  "03",       12,   "M",  "NERVOUS SYSTEM",   "HEADACHE",  3,
  "04",       12,   "F",  "NERVOUS SYSTEM",   "DIZZINESS", 1
)

adsl_sex <- tibble::tribble(
  ~USUBJID, ~TRTAN, ~SEX,
  "01",       11,   "M",
  "02",       11,   "F",
  "03",       12,   "M",
  "04",       12,   "F"
)

out4_trtN <- SOCbyPT_Grade(
  indata     = adae_sex,
  dmdata     = adsl_sex,
  group_vars = c("TRTAN", "AEBODSYS", "AEDECOD"),
  trtan_coln = "12",
  by_var     = "SEX",
  bigN_by    = "NO",
  print_bigN = TRUE
)

out4_byN <- SOCbyPT_Grade(
  indata     = adae_sex,
  dmdata     = adsl_sex,
  group_vars = c("TRTAN", "AEBODSYS", "AEDECOD"),
  trtan_coln = "12",
  by_var     = "SEX",
  bigN_by    = "YES",
  print_bigN = TRUE
)

out4_trtN
out4_byN
