tests: Tests/methods available in `add_p()` and `add_difference()`

Description

Below is a listing of tests available internally within gtsummary.

Tests whose pseudo-code includes ... accept additional arguments, which are passed with add_p(test.args=). For example, to calculate a p-value from t.test() assuming equal variance, use tbl_summary(trial, by = trt) %>% add_p(age ~ "t.test", test.args = age ~ list(var.equal = TRUE)).

Arguments

tbl_summary() %>% add_p()

alias	description	pseudo-code	details
`"t.test"`	t-test	`t.test(variable ~ as.factor(by), data = data, conf.level = 0.95, ...)`
`"aov"`	One-way ANOVA	`aov(variable ~ as.factor(by), data = data) %>% summary()`
`"oneway.test"`	One-way ANOVA	`oneway.test(variable ~ as.factor(by), data = data, ...)`
`"kruskal.test"`	Kruskal-Wallis test	`kruskal.test(data[[variable]], as.factor(data[[by]]))`
`"wilcox.test"`	Wilcoxon rank-sum test	`wilcox.test(as.numeric(variable) ~ as.factor(by), data = data, ...)`
`"chisq.test"`	chi-square test of independence	`chisq.test(x = data[[variable]], y = as.factor(data[[by]]), ...)`
`"chisq.test.no.correct"`	chi-square test of independence	`chisq.test(x = data[[variable]], y = as.factor(data[[by]]), correct = FALSE)`
`"fisher.test"`	Fisher's exact test	`fisher.test(data[[variable]], as.factor(data[[by]]), conf.level = 0.95, ...)`
`"mcnemar.test"`	McNemar's test	`tidyr::pivot_wider(id_cols = group, ...); mcnemar.test(by_1, by_2, conf.level = 0.95, ...)`
`"mcnemar.test.wide"`	McNemar's test	`mcnemar.test(data[[variable]], data[[by]], conf.level = 0.95, ...)`
`"lme4"`	random intercept logistic regression	`lme4::glmer(by ~ (1 | group), data, family = binomial) %>% anova(lme4::glmer(by ~ variable + (1 | group), data, family = binomial))`
`"paired.t.test"`	Paired t-test	`tidyr::pivot_wider(id_cols = group, ...); t.test(by_1, by_2, paired = TRUE, conf.level = 0.95, ...)`
`"paired.wilcox.test"`	Paired Wilcoxon signed-rank test	`tidyr::pivot_wider(id_cols = group, ...); wilcox.test(by_1, by_2, paired = TRUE, conf.int = TRUE, conf.level = 0.95, ...)`
`"prop.test"`	Test for equality of proportions	`prop.test(x, n, conf.level = 0.95, ...)`
`"ancova"`	ANCOVA	`lm(variable ~ by + adj.vars)`
`"emmeans"`	Estimated Marginal Means or LS-means	`lm(variable ~ by + adj.vars, data) %>% emmeans::emmeans(specs =~by) %>% emmeans::contrast(method = "pairwise") %>% summary(infer = TRUE, level = conf.level)`	When variable is binary, `glm(family = binomial)` and `emmeans(regrid = "response")` arguments are used. When `group` is specified, `lme4::lmer()` and `lme4::glmer()` are used with the group as a random intercept.
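To make the pseudo-code column concrete, the sketch below runs three of the listed calls directly in base R. The data frame `toy` and its values are made up for illustration, and any pre-processing gtsummary performs before calling the test is not reproduced here.

```r
# Hypothetical toy data standing in for the `data`, `variable`, and `by` placeholders
toy <- data.frame(
  age   = c(31, 45, 52, 38, 60, 47, 55, 29, 41, 66),
  grade = factor(c("I", "II", "I", "II", "I", "II", "II", "I", "I", "II")),
  trt   = rep(c("A", "B"), each = 5)
)

# Row "t.test": var.equal = TRUE plays the role of an argument supplied through `...`
tt <- t.test(age ~ as.factor(trt), data = toy, conf.level = 0.95, var.equal = TRUE)

# Row "kruskal.test"
kw <- kruskal.test(toy[["age"]], as.factor(toy[["trt"]]))

# Row "fisher.test"
ft <- fisher.test(toy[["grade"]], as.factor(toy[["trt"]]), conf.level = 0.95)

# Collect the three p-values in one named vector
p_values <- c(t.test = tt$p.value, kruskal.test = kw$p.value, fisher.test = ft$p.value)
p_values
```

Only rows whose pseudo-code ends in `...` take extra arguments this way; the `"kruskal.test"` row, for instance, does not.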

tbl_svysummary() %>% add_p()

alias	description	pseudo-code	details
`"svy.t.test"`	t-test adapted to complex survey samples	`survey::svyttest(~variable + by, data)`
`"svy.wilcox.test"`	Wilcoxon rank-sum test for complex survey samples	`survey::svyranktest(~variable + by, data, test = 'wilcoxon')`
`"svy.kruskal.test"`	Kruskal-Wallis rank-sum test for complex survey samples	`survey::svyranktest(~variable + by, data, test = 'KruskalWallis')`
`"svy.vanderwaerden.test"`	van der Waerden's normal-scores test for complex survey samples	`survey::svyranktest(~variable + by, data, test = 'vanderWaerden')`
`"svy.median.test"`	Mood's test for the median for complex survey samples	`survey::svyranktest(~variable + by, data, test = 'median')`
`"svy.chisq.test"`	chi-squared test with Rao & Scott's second-order correction	`survey::svychisq(~variable + by, data, statistic = 'F')`
`"svy.adj.chisq.test"`	chi-squared test adjusted by a design effect estimate	`survey::svychisq(~variable + by, data, statistic = 'Chisq')`
`"svy.wald.test"`	Wald test of independence for complex survey samples	`survey::svychisq(~variable + by, data, statistic = 'Wald')`
`"svy.adj.wald.test"`	adjusted Wald test of independence for complex survey samples	`survey::svychisq(~variable + by, data, statistic = 'adjWald')`
`"svy.lincom.test"`	test of independence using the exact asymptotic distribution for complex survey samples	`survey::svychisq(~variable + by, data, statistic = 'lincom')`
`"svy.saddlepoint.test"`	test of independence using a saddlepoint approximation for complex survey samples	`survey::svychisq(~variable + by, data, statistic = 'saddlepoint')`
`"emmeans"`	Estimated Marginal Means or LS-means	`survey::svyglm(variable ~ by + adj.vars, data) %>% emmeans::emmeans(specs =~by) %>% emmeans::contrast(method = "pairwise") %>% summary(infer = TRUE, level = conf.level)`	When variable is binary, `survey::svyglm(family = binomial)` and `emmeans(regrid = "response")` arguments are used.

tbl_survfit() %>% add_p()

alias	description	pseudo-code
`"logrank"`	Log-rank test	`survival::survdiff(Surv(.) ~ variable, data, rho = 0)`
`"tarone"`	Tarone-Ware test	`survival::survdiff(Surv(.) ~ variable, data, rho = 1.5)`
`"petopeto_gehanwilcoxon"`	Peto & Peto modification of Gehan-Wilcoxon test	`survival::survdiff(Surv(.) ~ variable, data, rho = 1)`
`"survdiff"`	G-rho family test	`survival::survdiff(Surv(.) ~ variable, data, ...)`
`"coxph_lrt"`	Cox regression (LRT)	`survival::coxph(Surv(.) ~ variable, data, ...)`
`"coxph_wald"`	Cox regression (Wald)	`survival::coxph(Surv(.) ~ variable, data, ...)`
`"coxph_score"`	Cox regression (Score)	`survival::coxph(Surv(.) ~ variable, data, ...)`
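As a rough illustration of the pseudo-code above, the following sketch calls `survival::survdiff()` and `survival::coxph()` directly on the `lung` data set that ships with the survival package. The helper `p_from_survdiff()` is a hypothetical name introduced here, and how gtsummary itself extracts the p-values is not shown in this page, so treat the extraction steps as an assumption.

```r
library(survival)

# G-rho family tests: rho = 0 is the log-rank test, rho = 1 the Peto & Peto modification
fit_logrank <- survdiff(Surv(time, status) ~ sex, data = lung, rho = 0)
fit_peto    <- survdiff(Surv(time, status) ~ sex, data = lung, rho = 1)

# Hypothetical helper: chi-square p-value with (number of groups - 1) degrees of freedom
p_from_survdiff <- function(fit) {
  pchisq(fit$chisq, df = length(fit$n) - 1, lower.tail = FALSE)
}

survdiff_p <- c(
  logrank = p_from_survdiff(fit_logrank),
  petopeto_gehanwilcoxon = p_from_survdiff(fit_peto)
)

# One Cox model yields the likelihood ratio, Wald, and score tests
cox_summary <- summary(coxph(Surv(time, status) ~ sex, data = lung))
cox_p <- c(
  coxph_lrt   = unname(cox_summary$logtest["pvalue"]),
  coxph_wald  = unname(cox_summary$waldtest["pvalue"]),
  coxph_score = unname(cox_summary$sctest["pvalue"])
)

survdiff_p
cox_p
```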

tbl_continuous() %>% add_p()

alias	description	pseudo-code
`"anova_2way"`	Two-way ANOVA	`lm(continuous_variable ~ by + variable)`
`"t.test"`	t-test	`t.test(continuous_variable ~ as.factor(variable), data = data, conf.level = 0.95, ...)`
`"aov"`	One-way ANOVA	`aov(continuous_variable ~ as.factor(variable), data = data) %>% summary()`
`"kruskal.test"`	Kruskal-Wallis test	`kruskal.test(data[[continuous_variable]], as.factor(data[[variable]]))`
`"wilcox.test"`	Wilcoxon rank-sum test	`wilcox.test(as.numeric(continuous_variable) ~ as.factor(variable), data = data, ...)`
`"lme4"`	random intercept logistic regression	`lme4::glmer(by ~ (1 \UFF5C group), data, family = binomial) %>% anova(lme4::glmer(variable ~ continuous_variable + (1 \UFF5C group), data, family = binomial))`
`"ancova"`	ANCOVA	`lm(continuous_variable ~ variable + adj.vars)`

tbl_summary() %>% add_difference()

alias	description	difference statistic	pseudo-code	details
`"t.test"`	t-test	mean difference	`t.test(variable ~ as.factor(by), data = data, conf.level = 0.95, ...)`
`"paired.t.test"`	Paired t-test	mean difference	`tidyr::pivot_wider(id_cols = group, ...); t.test(by_1, by_2, paired = TRUE, conf.level = 0.95, ...)`
`"prop.test"`	Test for equality of proportions	rate difference	`prop.test(x, n, conf.level = 0.95, ...)`
`"ancova"`	ANCOVA	mean difference	`lm(variable ~ by + adj.vars)`
`"ancova_lme4"`	ANCOVA with random intercept	mean difference	`lme4::lmer(variable ~ by + adj.vars + (1 \UFF5C group), data)`
`"cohens_d"`	Cohen's D	standardized mean difference	`effectsize::cohens_d(variable ~ by, data, ci = conf.level, ...)`
`"smd"`	Standardized Mean Difference	standardized mean difference	`smd::smd(x = data[[variable]], g = data[[by]], std.error = TRUE)`
`"emmeans"`	Estimated Marginal Means or LS-means	adjusted mean difference	`lm(variable ~ by + adj.vars, data) %>% emmeans::emmeans(specs =~by) %>% emmeans::contrast(method = "pairwise") %>% summary(infer = TRUE, level = conf.level)`	When variable is binary, `glm(family = binomial)` and `emmeans(regrid = "response")` arguments are used. When `group` is specified, `lme4::lmer()` and `lme4::glmer()` are used with the group as a random intercept.

tbl_svysummary() %>% add_difference()

alias	description	difference statistic	pseudo-code	details
`"smd"`	Standardized Mean Difference	standardized mean difference	`smd::smd(x = data$variables[[variable]], g = data$variables[[by]], w = weights(data), std.error = TRUE)`
`"emmeans"`	Estimated Marginal Means or LS-means	adjusted mean difference	`survey::svyglm(variable ~ by + adj.vars, data) %>% emmeans::emmeans(specs =~by) %>% emmeans::contrast(method = "pairwise") %>% summary(infer = TRUE, level = conf.level)`	When variable is binary, `survey::svyglm(family = binomial)` and `emmeans(regrid = "response")` arguments are used.

Custom Functions

To report a p-value (or difference) for a test not available in gtsummary, you can create a custom function. The function must return a data frame with exactly one row, structured like the output of broom::tidy() for a typical statistical test. The add_p() and add_difference() functions look for columns named "p.value", "estimate", "conf.low", "conf.high", and "method", which supply the p-value, the difference, the confidence interval limits, and the test name used in the footnote.

Example calculating a p-value from a t-test assuming a common variance between groups.

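library(gtsummary) # provides trial, tbl_summary(), and add_p()

# custom test: t-test assuming equal variances; returns a one-row data frame
ttest_common_variance <- function(data, variable, by, ...) {
  # keep only the rows where both columns are observed
  data <- data[c(variable, by)] %>% dplyr::filter(complete.cases(.))
  t.test(data[[variable]] ~ factor(data[[by]]), var.equal = TRUE) %>%
    broom::tidy()
}

# the custom function is referenced by name in the test= argument
trial[c("age", "trt")] %>%
  tbl_summary(by = trt) %>%
  add_p(test = age ~ "ttest_common_variance")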
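The one-row data frame can also be assembled by hand when a test has no broom::tidy() method, or when you prefer to avoid the dependency. The sketch below is a minimal example using base R only; the function name `ks_test_custom`, the choice of the Kolmogorov-Smirnov test, and the `toy` data are all hypothetical.

```r
# Hypothetical custom test: two-sample Kolmogorov-Smirnov test
ks_test_custom <- function(data, variable, by, ...) {
  # keep only rows where both the variable and the grouping column are observed
  data <- data[stats::complete.cases(data[c(variable, by)]), ]
  groups <- split(data[[variable]], factor(data[[by]]))
  if (length(groups) != 2) stop("`by` must have exactly two levels")

  res <- stats::ks.test(groups[[1]], groups[[2]])

  # one-row data frame using the column names described above
  data.frame(p.value = res$p.value, method = res$method)
}

# Made-up data with one missing value to exercise the complete-case filter
toy <- data.frame(
  age = c(31, 45, 52, NA, 60, 47, 55, 29, 41, 66),
  trt = rep(c("A", "B"), each = 5)
)

ks_result <- ks_test_custom(toy, variable = "age", by = "trt")
ks_result
```

Following the pattern of the example above, such a function would be supplied by name, e.g. add_p(test = age ~ "ks_test_custom").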

A custom function for add_difference() is similar, but is also passed the conf.level= and adj.vars= arguments.

# as above, but conf.level is forwarded to t.test() so the interval uses the requested level
ttest_common_variance <- function(data, variable, by, conf.level, ...) {
  data <- data[c(variable, by)] %>% dplyr::filter(complete.cases(.))
  t.test(data[[variable]] ~ factor(data[[by]]), conf.level = conf.level, var.equal = TRUE) %>%
    broom::tidy()
}
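For a difference, the returned row needs "estimate", "conf.low", and "conf.high" in addition to any p-value. The following sketch builds those columns by hand from `stats::wilcox.test(conf.int = TRUE)`, which reports a location-shift estimate and its confidence interval. The function name `location_shift_custom`, the default `conf.level = 0.95`, and the `toy` data are assumptions made for illustration.

```r
# Hypothetical custom difference: location shift between two groups
location_shift_custom <- function(data, variable, by, conf.level = 0.95, ...) {
  # keep only rows where both columns are observed
  data <- data[stats::complete.cases(data[c(variable, by)]), ]

  res <- stats::wilcox.test(
    data[[variable]] ~ factor(data[[by]]),
    conf.int = TRUE, conf.level = conf.level
  )

  # one-row data frame with the difference, its interval, and the p-value
  data.frame(
    estimate  = unname(res$estimate),
    conf.low  = res$conf.int[1],
    conf.high = res$conf.int[2],
    p.value   = res$p.value,
    method    = "Wilcoxon rank-sum location shift"
  )
}

# Made-up data without ties, so exact intervals can be computed
toy <- data.frame(
  age = c(31, 45, 52, 38, 60, 47, 55, 29, 41, 66),
  trt = rep(c("A", "B"), each = 5)
)

shift_95 <- location_shift_custom(toy, "age", "trt", conf.level = 0.95)
shift_80 <- location_shift_custom(toy, "age", "trt", conf.level = 0.80)
rbind(shift_95, shift_80)
```

Requesting a lower confidence level should give an interval no wider than the 95% one, which is a quick check that conf.level is actually being forwarded.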

Function Arguments

For tbl_summary() objects, the custom function will be passed the following arguments: custom_pvalue_fun(data=, variable=, by=, group=, type=, conf.level=, adj.vars=). Your function need not use every one of these arguments, but they are all passed, so the function must accept them. We recommend including ... to future-proof against updates where additional arguments are added.

The following table describes the argument inputs for each gtsummary table type.

argument	tbl_summary	tbl_svysummary	tbl_survfit	tbl_continuous
`data=`	A data frame	A survey object	A `survfit()` object	A data frame
`variable=`	String variable name	String variable name	`NA`	String variable name
`by=`	String variable name	String variable name	`NA`	String variable name
`group=`	String variable name	`NA`	`NA`	String variable name
`type=`	Summary type	Summary type	`NA`	`NA`
`conf.level=`	Confidence interval level	`NA`	`NA`	`NA`
`adj.vars=`	Character vector of adjustment variable names (e.g. used in ANCOVA)	`NA`	`NA`	Character vector of adjustment variable names (e.g. used in ANCOVA)
`continuous_variable=`	`NA`	`NA`	`NA`	String of the continuous variable name
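A skeleton that accepts every argument listed for tbl_summary() might look like the sketch below. The default values, the Kruskal-Wallis body, the `toy` data, and the extra argument `some_future_argument` are all hypothetical; the direct call at the end only simulates how the function might be invoked with an argument it does not know about.

```r
# Hypothetical skeleton: names each documented argument and absorbs anything else via `...`
custom_pvalue_fun <- function(data, variable, by, group = NULL, type = NULL,
                              conf.level = 0.95, adj.vars = NULL, ...) {
  # only data, variable, and by are used here; the rest are accepted and ignored
  res <- stats::kruskal.test(data[[variable]], as.factor(data[[by]]))
  data.frame(p.value = res$p.value, method = res$method)
}

# Made-up data for the simulated call
toy <- data.frame(
  age = c(31, 45, 52, 38, 60, 47, 55, 29, 41, 66),
  trt = rep(c("A", "B"), each = 5)
)

# Simulated call: all documented arguments plus one unknown argument
out <- custom_pvalue_fun(
  data = toy, variable = "age", by = "trt",
  group = NULL, type = "continuous",
  conf.level = 0.95, adj.vars = NULL,
  some_future_argument = TRUE
)
out
```

Without `...` in the signature, the call above would fail with an "unused argument" error, which is why including it is recommended.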
