scf_freq: Estimate the Frequencies of a Discrete Variable from SCF Microdata

Description

This function estimates the relative frequency (proportion) of each category in a discrete variable from the SCF public-use microdata. Use it to examine the univariate distribution of a discrete variable, optionally within groups.

Usage

scf_freq(scf, var, by = NULL, percent = TRUE)

Value

A list of class "scf_freq" with:

results: Pooled category proportions and standard errors, by group if specified.
imps: A named list of implicate-level proportion estimates.
aux: Metadata about the variable and grouping structure.

Arguments

scf: An scf_mi_survey object created by scf_load(). Must contain five replicate-weighted implicates.
var: A one-sided formula specifying a categorical variable (e.g., ~racecl).
by: Optional one-sided formula specifying a discrete grouping variable (e.g., ~own).
percent: Logical. If TRUE (default), scales results and standard errors to percentages.

Details

Proportions are estimated within each implicate using survey::svymean(), then pooled across implicates. When a grouping variable is provided via by, estimates are produced separately for each group-category combination. By default, results and standard errors are reported as percentages; set percent = FALSE to keep them as proportions.

Pooling across implicates works as follows:

The point estimate is the mean of the implicate-level proportions.
The standard error reflects both within-implicate sampling variance and between-implicate variation.

Unlike means or model parameters, category proportions are not pooled with Rubin's full combination rules; for example, no degrees of freedom are computed.
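
As an illustration of the pooling described above, the following is a minimal base-R sketch, not the package's own code. It assumes the common multiple-imputation variance form T = W + (1 + 1/M) * B, which this page does not spell out, so the actual computation inside scf_freq() may differ. The helper pool_props() and all input numbers are hypothetical.

```r
# Hypothetical helper: pool implicate-level proportions.
# p:  M x K matrix of proportions (rows = implicates, columns = categories)
# se: M x K matrix of the matching within-implicate standard errors
pool_props <- function(p, se) {
  m <- nrow(p)
  est <- colMeans(p)            # point estimate: mean across implicates
  within <- colMeans(se^2)      # average within-implicate variance (W)
  between <- apply(p, 2, var)   # variance of estimates across implicates (B)
  # ASSUMPTION: total variance T = W + (1 + 1/M) * B
  total <- within + (1 + 1 / m) * between
  data.frame(category = colnames(p), proportion = est, se = sqrt(total),
             row.names = NULL)
}

# Made-up numbers for five implicates and two categories
p <- cbind(owner  = c(0.64, 0.66, 0.65, 0.63, 0.67),
           renter = c(0.36, 0.34, 0.35, 0.37, 0.33))
se <- matrix(0.01, nrow = 5, ncol = 2, dimnames = dimnames(p))

pooled <- pool_props(p, se)
print(pooled)
```

Multiplying the proportion and se columns by 100 would mimic the scaling applied when percent = TRUE.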

Examples

# Ignore this code block.  It loads mock data for CRAN.
# In your analysis, download and load your data using the
# functions `scf_download()` and `scf_load()`
td <- tempfile("freq_")
dir.create(td)

src <- system.file("extdata", "scf2022_mock_raw.rds", package = "scf")
file.copy(src, file.path(td, "scf2022.rds"), overwrite = TRUE)
scf2022 <- scf_load(2022, data_directory = td)

# EXAMPLE IMPLEMENTATION: Proportions of homeownership
scf_freq(scf2022, ~own)

# EXAMPLE IMPLEMENTATION: Homeownership proportions within each education category
scf_freq(scf2022, ~own, by = ~edcl)

# Ignore the code below.  It is for CRAN:
unlink(td, recursive = TRUE, force = TRUE)