surveycore (version 0.8.3)

get_quantiles: Survey-Weighted Quantiles

Description

Compute survey-weighted quantiles (including the median) for a single numeric variable using the Woodruff (1952) confidence interval method. Supports optional grouping, domain estimation, and all five survey design classes.

Usage

get_quantiles(
  design,
  x,
  probs = c(0.25, 0.5, 0.75),
  group = NULL,
  variance = "ci",
  conf_level = 0.95,
  n_weighted = FALSE,
  decimals = NULL,
  min_cell_n = 30L,
  na.rm = TRUE,
  label_values = TRUE,
  label_vars = TRUE,
  name_style = "surveycore",
  ...,
  .id = NULL,
  .if_missing_var = NULL
)

Value

A survey_quantiles tibble (also inheriting survey_result).

  • [group_cols...] — group variable columns (when active), first.

  • quantile — probability label: "p25", "p50", etc.

  • estimate — weighted quantile estimate.

  • Variance columns (se, var, cv, ci_low, ci_high, moe, deff) — only those requested via variance. CIs are Woodruff intervals and are generally asymmetric around estimate. deff is always NA for quantile estimates: computing it requires a kernel density estimate at the quantile point (the Woodruff SRS approximation used by survey::svyquantile(deff = TRUE)), which is not implemented.

  • n — unweighted count of non-NA observations used in the estimate.

  • n_weighted — sum of weights (only when requested).

One row per (group combination × quantile probability). The variable name and probs vector are stored in meta(result).
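The asymmetry of ci_low and ci_high follows from how a Woodruff interval is built: a symmetric interval is formed on the probability scale and then mapped back through the weighted empirical CDF. The base R sketch below illustrates the idea only. The helpers weighted_quantile() and woodruff_ci() are hypothetical names, not surveycore functions. The sketch assumes independent with-replacement draws with no strata or clusters and uses no interpolation, so its numbers will generally differ from get_quantiles() on a real design.

```r
# Hypothetical helper (not part of surveycore): inverse of the weighted
# empirical CDF, i.e. the smallest x whose cumulative weight share is >= p.
# Assumption: no interpolation between order statistics.
weighted_quantile <- function(x, w, p) {
  ord <- order(x)
  x <- x[ord]
  w <- w[ord]
  cum_share <- cumsum(w) / sum(w)
  idx <- which(cum_share >= p)[1]
  if (is.na(idx)) idx <- length(x)  # guard against rounding when p is 1
  x[idx]
}

# Hypothetical helper (not part of surveycore): Woodruff-style interval.
# Assumption: single-stage, with-replacement sampling; a real design-based
# standard error would account for strata, clusters, or replicate weights.
woodruff_ci <- function(x, w, p, conf_level = 0.95) {
  q_hat <- weighted_quantile(x, w, p)

  # Step 1: estimate the CDF at q_hat as a weighted mean of an indicator.
  ind <- as.numeric(x <= q_hat)
  p_hat <- sum(w * ind) / sum(w)

  # Step 2: linearized standard error of that weighted proportion.
  n <- length(x)
  z_lin <- w * (ind - p_hat) / sum(w)
  se_p <- sqrt(n / (n - 1) * sum(z_lin^2))

  # Step 3: symmetric interval on the probability scale, clipped to [0, 1].
  z <- qnorm(1 - (1 - conf_level) / 2)
  p_lo <- max(p - z * se_p, 0)
  p_hi <- min(p + z * se_p, 1)

  # Step 4: back-transform both bounds through the weighted quantile function.
  c(
    estimate = q_hat,
    ci_low   = weighted_quantile(x, w, p_lo),
    ci_high  = weighted_quantile(x, w, p_hi)
  )
}

# Simulated, right-skewed data with arbitrary weights (illustration only).
set.seed(1)
x <- rexp(500)
w <- runif(500, min = 1, max = 3)
res <- woodruff_ci(x, w, p = 0.5)
print(res)
```

Because the back-transformation in step 4 follows the shape of the distribution, the distances from estimate to ci_low and to ci_high are usually unequal for skewed variables.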

Arguments

design

A survey design object: survey_taylor, survey_replicate, survey_twophase, or survey_nonprob. A survey_collection is also accepted; see .id and .if_missing_var.

x

<tidy-select> A single unquoted numeric variable name. Must resolve to exactly one numeric column.

probs

Numeric vector of probabilities in (0, 1). Default c(0.25, 0.5, 0.75): the first quartile, the median, and the third quartile.

group

<tidy-select> Optional grouping variable(s). Combined with any grouping set by group_by(). Default NULL.

variance

NULL or a character vector from "se", "ci", "var", "cv", "moe", "deff". Controls which uncertainty columns appear in the output. CIs use the Woodruff (1952) back-transformation method and are not symmetric around the estimate. "deff" is always NA for quantiles (no closed-form SRS SE). Default "ci".

conf_level

Numeric scalar in (0, 1). Confidence level for Woodruff intervals. Default 0.95.

n_weighted

Logical. If TRUE, add an n_weighted column with the sum of weights for non-NA observations in each group. Default FALSE.

decimals

Integer or NULL. If an integer, rounds all numeric output columns (e.g., estimate, se, ci_low, ci_high) to this many decimal places. Default NULL (no rounding).

min_cell_n

Integer. Minimum unweighted cell count; cells with fewer observations trigger surveycore_warning_small_cell. Default 30L (AAPOR guidance).

na.rm

Logical. If TRUE (default), NA values are excluded from analysis: observations where the analysis variable is NA are dropped from calculations, and observations where any group variable is NA are excluded from the output. If FALSE, NA observations in the analysis variable are included in calculations, and observations where a group variable is NA are collected into their own group row in the output (appearing after all non-NA group rows).

label_values

Logical. Accepted for API uniformity; has no visible effect on get_quantiles() output. Default TRUE.

label_vars

Logical. Accepted for API uniformity; has no visible effect on get_quantiles() output. Default TRUE.

name_style

"surveycore" (default) or "broom". When "broom", renames se → std.error, ci_low → conf.low, ci_high → conf.high. The estimate column is unchanged.

...

Unused. Reserved so that .id and .if_missing_var remain named-only when a survey_collection is passed as design.

.id

Character(1) or NULL. Column name used to identify each survey when design is a survey_collection. For collection inputs, NULL (the default) resolves to the collection's stored @id property. Pass a non-NULL value to override. Ignored when design is a single survey.

.if_missing_var

"error", "skip", or NULL. How to handle surveys in a collection that lack one of the requested NSE variables. For collection inputs, NULL (the default) resolves to the collection's stored @if_missing_var property. Pass a non-NULL value to override. Ignored when design is a single survey.
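The column renaming performed by name_style = "broom" can be pictured with a small base R sketch. The helper to_broom_names() is hypothetical and not part of surveycore, and the toy data frame holds made-up values that only mimic the documented column names.

```r
# Hypothetical helper (not part of surveycore): apply the documented
# "broom" renaming to a data frame that uses surveycore-style names.
to_broom_names <- function(result) {
  map <- c(se = "std.error", ci_low = "conf.low", ci_high = "conf.high")
  hit <- names(result) %in% names(map)
  names(result)[hit] <- unname(map[names(result)[hit]])
  result
}

# Made-up values for illustration; not real survey output.
toy <- data.frame(
  quantile = c("p25", "p50", "p75"),
  estimate = c(10, 20, 30),
  ci_low   = c(9, 18, 27),
  ci_high  = c(12, 23, 35),
  n        = c(100L, 100L, 100L)
)

renamed <- to_broom_names(toy)
print(names(renamed))
```

Columns without a broom counterpart, such as quantile, estimate, and n, keep their names.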

References

Woodruff, R. S. (1952). Confidence intervals for medians and other position measures. Journal of the American Statistical Association, 47(260), 635–646.

See Also

Other analysis: clean(), get_anova(), get_corr(), get_covariance(), get_diffs(), get_freqs(), get_means(), get_pairwise(), get_ratios(), get_t_test(), get_totals(), get_variance(), meta()

Examples

d <- as_survey(nhanes_2017, ids = sdmvpsu, weights = wtint2yr,
               strata = sdmvstra, nest = TRUE)

# Quartiles: 25th, 50th (median), and 75th percentiles (default)
get_quantiles(d, ridageyr)

# Median only with SE
get_quantiles(d, ridageyr, probs = 0.5, variance = c("ci", "se"))

# Grouped quartiles
get_quantiles(d, ridageyr, group = riagendr)