get_diffs: Treatment Effect Estimation for Survey Designs

Description

Estimates treatment effects (differences from a reference group) via survey-weighted regression. Supports bivariate and multivariate models, Gaussian and non-Gaussian families, and optional subgroup analysis.

Usage

get_diffs(
  design,
  x,
  treats,
  group = NULL,
  covariates = NULL,
  ref_level = NULL,
  pval_adj = NULL,
  show_means = TRUE,
  show_pct_change = FALSE,
  scale = c("ame", "link"),
  variance = "ci",
  conf_level = 0.95,
  min_cell_n = 30L,
  n_weighted = FALSE,
  decimals = NULL,
  na.rm = TRUE,
  label_values = TRUE,
  name_style = "surveycore",
  ...,
  .id = NULL,
  .if_missing_var = NULL
)

Value

A survey_diffs tibble (also inheriting survey_result). Columns (in order): group columns (when active), treatment variable, estimate, pct_change (optional), mean (optional), n, n_weighted (optional), se (optional), ci_low (optional), ci_high (optional), p_value, stars. Use meta() to access design type, family, reference level, and other metadata.

Arguments

design: A survey design object: survey_taylor, survey_replicate, survey_twophase, or survey_nonprob.
x: <tidy-select> A single unquoted numeric variable name for the dependent variable. Must resolve to exactly one numeric column (continuous or 0/1 binary).
treats: <tidy-select> A single unquoted variable name for the treatment/group variable. Must resolve to exactly one column with at least 2 unique levels. Coerced to factor if not already.
group: <tidy-select> Optional subgroup variable(s) for interaction analysis. When provided, treatment effects are reported separately within each subgroup. Combined with any grouping set by group_by(). Default NULL.
covariates: Character vector of additional model terms as strings. Supports interactions ("age * gender"), polynomials ("poly(edu, 2)"), and transformations ("log(income)"). When provided, forces the marginaleffects estimation path. Default NULL.
ref_level: Character(1). Reference level of treats for comparisons. If NULL (default), the first factor level is used. Must match an existing level.
pval_adj: Character(1) or NULL. P-value adjustment method passed to stats::p.adjust(). Options: "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none". NULL = no adjustment. When group is active, adjustment is applied independently within each group.
show_means: Logical. If TRUE (default), includes a mean column and a reference row with estimate = 0. Subject to link-scale suppression (see Details).
show_pct_change: Logical. If TRUE, includes a pct_change column: estimate / reference_mean. Subject to link-scale suppression (see Details). Default FALSE.
scale: Character(1). "ame" (default): average marginal effects on the response scale. "link": coefficients on the link scale. For Gaussian/identity models, both are identical. Case-sensitive.
variance: NULL or a character vector of one or more of "se", "ci". Controls which uncertainty columns appear. Default "ci".
conf_level: Numeric(1) in (0, 1). Confidence level. Default 0.95.
min_cell_n: Integer(1). Minimum unweighted cell size before surveycore_warning_small_cell fires. Default 30L.
n_weighted: Logical. If TRUE, includes an n_weighted column with sum of weights per treatment level. Default FALSE.
decimals: Integer(1) or NULL. If non-NULL, rounds numeric output columns. pct_change is rounded to decimals + 2. Default NULL.
na.rm: Logical. If TRUE (default), rows with NA in x, treats, or group are dropped before fitting. If FALSE, NA values cause an error.
label_values: Logical. If TRUE (default), the treats and group columns display value labels from metadata instead of raw codes. Output type is factor when labels are applied.
name_style: "surveycore" (default) or "broom". When "broom", renames se to std.error, ci_low to conf.low, etc. The mean column is excluded from renaming.
...: Passed to survey_glm(). A common use is family = quasibinomial() for a 0/1 binary x.
.id: Character(1) or NULL. Column name used to identify each survey when design is a survey_collection. For collection inputs, NULL (the default) resolves to the collection's stored @id property. Pass a non-NULL value to override. Ignored when design is a single survey.
.if_missing_var: "error", "skip", or NULL. How to handle surveys in a collection that lack one of the requested NSE variables. For collection inputs, NULL (the default) resolves to the collection's stored @if_missing_var property. Pass a non-NULL value to override. Ignored when design is a single survey.
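
The arithmetic behind show_pct_change and decimals can be sketched in base R. The numbers below are made up, and the sketch assumes pct_change is computed from the unrounded estimate before rounding is applied; the argument descriptions above do not state the order of operations.

```r
# Hypothetical values; not output from a real get_diffs() call.
estimate       <- 3.4567   # difference from the reference group
reference_mean <- 48.1234  # mean of the reference group
decimals       <- 1L

# pct_change as described for show_pct_change: estimate / reference_mean
pct_change <- estimate / reference_mean

# Numeric columns are rounded to `decimals`; pct_change keeps two extra digits
estimate_rounded   <- round(estimate, decimals)
pct_change_rounded <- round(pct_change, decimals + 2L)
```

The two extra digits keep pct_change readable as a proportion when the other columns are rounded coarsely.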

Details

Estimation Paths

get_diffs() uses two estimation paths:

Clean path (bivariate Gaussian, no group): extracts coefficients directly from clean(). The intercept is the reference group mean; treatment coefficients are differences from reference.
Marginaleffects path (covariates, non-Gaussian with scale = "ame", or group): uses avg_slopes() for estimates and avg_predictions() for means.
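
The coefficient interpretation in the first path can be checked with a small base R sketch. Here stats::lm() with weights is only a stand-in for the package's survey-weighted fit: the point estimates match a weighted regression, but the standard errors are not design-based. The data and variable names are invented for illustration.

```r
# Illustration only: lm() stands in for the survey-weighted model.
set.seed(42)
df <- data.frame(
  wt  = runif(200, 0.5, 2),
  dv  = rnorm(200, 50, 10),
  arm = factor(sample(c("Control", "A", "B"), 200, TRUE),
               levels = c("Control", "A", "B"))  # Control is the reference
)

fit <- lm(dv ~ arm, data = df, weights = wt)
b <- coef(fit)

# Weighted mean of dv within each arm, in level order
wmeans <- sapply(split(df, df$arm), function(g) weighted.mean(g$dv, g$wt))

# Intercept = weighted mean of the reference group
ref_mean <- unname(b["(Intercept)"])

# Treatment coefficients = differences from the reference mean
diffs <- unname(b[c("armA", "armB")])
```

Comparing ref_mean with wmeans["Control"], and diffs with wmeans[c("A", "B")] - wmeans["Control"], shows why no separate prediction step is needed in the bivariate Gaussian case.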

Link-Scale Suppression

When scale = "link" and the family is non-Gaussian, the mean and pct_change columns are omitted entirely, even if show_means or show_pct_change is TRUE, because link-scale means are not substantively meaningful.

P-Value Adjustment

When group is active, p-value adjustment is applied independently within each group. For global adjustment across all comparisons, apply stats::p.adjust() to the result manually. Confidence intervals reflect the specified conf_level and are not affected by p-value adjustment.
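
A minimal sketch of the difference between within-group and global adjustment, using only stats::p.adjust(). The data frame is a hypothetical stand-in for a get_diffs() result with one subgroup column; the region and arm values and the p-values are made up.

```r
# Hypothetical result with a subgroup column; values are invented.
res <- data.frame(
  region  = rep(c("North", "South"), each = 2),
  arm     = rep(c("A", "B"), times = 2),
  p_value = c(0.010, 0.040, 0.030, 0.200)
)

# Within-group adjustment: each region's p-values are adjusted separately
res$p_within <- ave(res$p_value, res$region,
                    FUN = function(p) p.adjust(p, method = "holm"))

# Global adjustment: all four comparisons are adjusted together
res$p_global <- p.adjust(res$p_value, method = "holm")
```

The global adjustment is never less conservative than the within-group one here, because it corrects for four comparisons instead of two.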

Degrees of Freedom

All p-values and confidence intervals use the t-distribution with design-based residual degrees of freedom, regardless of estimation path.
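
The t-based interval and p-value can be sketched as follows. All inputs are hypothetical; in practice the standard error and degrees of freedom come from the fitted survey model.

```r
# Hypothetical inputs; none of these come from a real fit.
estimate   <- 2.5   # treatment effect
se         <- 1.1   # design-based standard error
df_resid   <- 48    # design-based residual degrees of freedom
conf_level <- 0.95

t_stat <- estimate / se
t_crit <- qt(1 - (1 - conf_level) / 2, df = df_resid)

ci_low  <- estimate - t_crit * se
ci_high <- estimate + t_crit * se
p_value <- 2 * pt(-abs(t_stat), df = df_resid)

# Normal critical value, for comparison only: it gives a narrower interval
z_crit <- qnorm(1 - (1 - conf_level) / 2)
```

With few degrees of freedom, t_crit is noticeably larger than z_crit, so intervals from a design with few primary sampling units are wider than a normal approximation would suggest.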

Non-Gaussian Models

By default, non-Gaussian models report average marginal effects on the response scale. Set scale = "link" for coefficients on the link scale (e.g., log-odds for logistic regression).
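
To make the two scales concrete, the sketch below fits a weighted logistic model with stats::glm() as a stand-in for the package's survey-weighted fit (point estimates only; the standard errors are not design-based). The data are simulated. With a single binary treatment and no covariates, the average marginal effect reduces to the difference in predicted probabilities.

```r
# Illustration only: glm() stands in for the survey-weighted model.
set.seed(1)
df <- data.frame(
  wt  = runif(300, 0.5, 2),
  arm = factor(sample(c("Control", "Treated"), 300, TRUE))
)
df$y <- rbinom(300, 1, ifelse(df$arm == "Treated", 0.6, 0.4))

fit <- glm(y ~ arm, data = df, weights = wt, family = quasibinomial())
b <- coef(fit)

# Link scale: the coefficient is a difference in log-odds
link_effect <- unname(b["armTreated"])

# Response scale: difference in predicted probabilities
p_control <- plogis(unname(b["(Intercept)"]))
p_treated <- plogis(unname(b["(Intercept)"] + b["armTreated"]))
ame <- p_treated - p_control

# Weighted proportions computed directly, for comparison with the model
wprop <- sapply(split(df, df$arm), function(g) weighted.mean(g$y, g$wt))
```

Here ame matches wprop["Treated"] - wprop["Control"], a difference in proportions, while link_effect is the same contrast expressed in log-odds.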

Examples

library(marginaleffects)

# Create survey design with treatment groups
set.seed(42)
df <- data.frame(
  id = 1:200, wt = runif(200, 0.5, 2),
  dv = rnorm(200, 50, 10),
  arm = factor(sample(c("Control", "A", "B"), 200, TRUE))
)
d <- as_survey(df, weights = wt)

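# Basic treatment effect. The reference defaults to the first factor level,
# which is "A" here because factor() orders levels alphabetically.
get_diffs(d, dv, arm)

# Compare against the control arm instead
get_diffs(d, dv, arm, ref_level = "Control")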

# With percentage change and p-value adjustment
get_diffs(d, dv, arm, show_pct_change = TRUE, pval_adj = "BH")
