Use these functions to calculate multiple summaries of nested or hierarchical data
in a single call.
ard_stack_hierarchical(): Calculates rates of events (e.g. adverse events), utilizing the denominator and id arguments to identify the rows in data to include in each rate calculation.
ard_stack_hierarchical_count(): Calculates counts of events utilizing all rows for each tabulation.
ard_stack_hierarchical(
  data,
  variables,
  by = dplyr::group_vars(data),
  id,
  denominator,
  include = everything(),
  statistic = everything() ~ c("n", "N", "p"),
  overall = FALSE,
  over_variables = FALSE,
  attributes = FALSE,
  total_n = FALSE,
  shuffle = FALSE
)

ard_stack_hierarchical_count(
  data,
  variables,
  by = dplyr::group_vars(data),
  denominator = NULL,
  include = everything(),
  overall = FALSE,
  over_variables = FALSE,
  attributes = FALSE,
  total_n = FALSE,
  shuffle = FALSE
)
Both functions return an ARD data frame of class 'card'.
data: (data.frame) a data frame.
variables: (tidy-select) Specifies the nested/hierarchical structure of the data. The variables that are specified here and in the include argument will have summary statistics calculated.
by: (tidy-select) variables to perform tabulations by. All combinations of the variables specified here appear in results. Default is dplyr::group_vars(data).
id: (tidy-select) columns used to subset data, identifying the rows used to calculate event rates in ard_stack_hierarchical(). See details below.
denominator: (data.frame, integer) used to define the denominator and enhance the output. The argument is required for ard_stack_hierarchical() and optional for ard_stack_hierarchical_count().
- the univariate tabulations of the by variables are calculated with denominator when a data frame is passed, e.g. tabulation of the treatment assignment counts that may appear in the header of a table.
- the denominator argument must be specified when id is used to calculate the event rates.
- if total_n=TRUE, the denominator argument is used to return the total N.
include: (tidy-select) Specify the subset of columns indicated in the variables argument for which summary statistics will be returned. Default is everything().
statistic: (formula-list-selector) a named list, a list of formulas, or a single formula where the list element is one or more of c("n", "N", "p") (or the RHS of a formula).
overall: (scalar logical) logical indicating whether overall statistics should be calculated (i.e. repeat the operations with by=NULL in most cases, see below for details). Default is FALSE.
over_variables: (scalar logical) logical indicating whether summary statistics should be calculated over or across the columns listed in the variables argument. Default is FALSE.
attributes: (scalar logical) logical indicating whether to include the results of ard_attributes() for all variables represented in the ARD. Default is FALSE.
total_n: (scalar logical) logical indicating whether to include the results of ard_total_n(denominator) in the returned ARD. Default is FALSE.
shuffle: (scalar logical) logical indicating whether to perform shuffle_ard() on the final result. Default is FALSE.
To calculate event rates, the ard_stack_hierarchical() function identifies rows to include in the calculation.
First, the primary data frame is sorted by the columns identified in the id, by, and variables arguments.
As the function cycles over the variables specified in the variables argument, the data frame is grouped by id, intersect(by, names(denominator)), and variables, and only the last row within each group is used.
For example, if the call is ard_stack_hierarchical(data = ADAE, variables = c(AESOC, AEDECOD), id = USUBJID), then we'd first subset ADAE to be one row within the grouping c(USUBJID, AESOC, AEDECOD) to calculate the event rates in 'AEDECOD'. We'd then repeat and subset ADAE to be one row within the grouping c(USUBJID, AESOC) to calculate the event rates in 'AESOC'.
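The row selection described above can be sketched in base R. This is only an illustration of the idea, not the package's internal code: the toy data, the assumed denominator of four subjects, and the helper keep_last_row() are all hypothetical.

```r
# Toy adverse-event records (hypothetical values, not the ADAE data set).
adae <- data.frame(
  USUBJID = c("01", "01", "01", "02", "02", "03"),
  AESOC   = c("CARDIAC", "CARDIAC", "GI", "CARDIAC", "GI", "GI"),
  AEDECOD = c("PALPITATIONS", "PALPITATIONS", "NAUSEA",
              "TACHYCARDIA", "NAUSEA", "NAUSEA"),
  stringsAsFactors = FALSE
)

# Assumed denominator: four subjects at risk, one of whom had no events.
n_subjects <- 4

# Hypothetical helper: sort by the grouping columns, then keep the
# last row within each group.
keep_last_row <- function(df, group_cols) {
  ord <- do.call(order, unname(as.list(df[group_cols])))
  df <- df[ord, , drop = FALSE]
  df[!duplicated(df[group_cols], fromLast = TRUE), , drop = FALSE]
}

# Rates for AEDECOD: one row per USUBJID / AESOC / AEDECOD.
by_term <- keep_last_row(adae, c("USUBJID", "AESOC", "AEDECOD"))
rate_term <- table(by_term$AEDECOD) / n_subjects

# Rates for AESOC: one row per USUBJID / AESOC.
by_soc <- keep_last_row(adae, c("USUBJID", "AESOC"))
rate_soc <- table(by_soc$AESOC) / n_subjects

# A count-style tabulation, by contrast, uses every row.
count_term <- table(adae$AEDECOD)
```

Subject "01" reports PALPITATIONS twice, so that term contributes two rows to count_term but only one row to by_term, which is the difference between counting events and counting subjects with an event.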
When we set overall=TRUE, we wish to re-run our calculations removing the stratifying columns. For example, if we ran the code below, the results would also include those from re-running the same call with by=NULL.
ard_stack_hierarchical(
  data = ADAE,
  variables = c(AESOC, AEDECOD),
  by = TRTA,
  id = USUBJID,  # id has no default and is needed to calculate event rates
  denominator = ADSL |> dplyr::rename(TRTA = ARM),
  overall = TRUE
)
But there is another case to be aware of: when the by argument includes columns that are not present in the denominator, for example when tabulating results by AE grade or severity in addition to treatment assignment.
In the example below, we're tabulating results by treatment assignment and AE severity. By specifying overall=TRUE, the calculations are re-run to get results with by = AESEV and again with by = NULL.
ard_stack_hierarchical(
  data = ADAE,
  variables = c(AESOC, AEDECOD),
  by = c(TRTA, AESEV),
  id = USUBJID,  # id has no default and is needed to calculate event rates
  denominator = ADSL |> dplyr::rename(TRTA = ARM),
  overall = TRUE
)
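As a rough sketch of the rule described above, the set of by specifications evaluated under overall=TRUE can be derived in base R. The helper overall_by_runs() and the denominator column names are hypothetical, and the package's internals may differ; the sketch only mirrors the behavior stated in the text.

```r
# Hypothetical helper: list the `by` specifications that would be
# evaluated when overall = TRUE, following the description above.
overall_by_runs <- function(by, denominator_names) {
  # Stratifying columns that the denominator cannot account for.
  by_not_in_denominator <- setdiff(by, denominator_names)
  # Original run, run over the remaining columns, and run with no
  # stratification; unique() drops repeats when `by` is fully
  # contained in the denominator.
  unique(list(by, by_not_in_denominator, character(0)))
}

# Assumed denominator columns (illustrative only).
denominator_names <- c("USUBJID", "TRTA", "AGE")

runs_two <- overall_by_runs(c("TRTA", "AESEV"), denominator_names)
runs_one <- overall_by_runs("TRTA", denominator_names)
```

Here runs_two holds three specifications (TRTA with AESEV, AESEV alone, and no by columns), while runs_one holds only two, matching the simpler case where overall=TRUE adds a single by=NULL run.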
ard_stack_hierarchical(
  ADAE,
  variables = c(AESOC, AEDECOD),
  by = TRTA,
  denominator = ADSL |> dplyr::rename(TRTA = ARM),
  id = USUBJID
)

ard_stack_hierarchical_count(
  ADAE,
  variables = c(AESOC, AEDECOD),
  by = TRTA,
  denominator = ADSL |> dplyr::rename(TRTA = ARM)
)