scf (version 1.0.4)

scf_percentile: Estimate Percentile in a Continuous Variable in SCF Microdata

Description

Calculates the percentile score of a continuous variable in the SCF microdata. Use this function either (1) to identify where a continuous variable's value stands in relation to all observed values, or (2) to find the value below which a user-specified percentage of households fall on that metric.

Usage

scf_percentile(scf, var, q = 0.5, by = NULL, verbose = FALSE)

Value

A list of class "scf_percentile" with:

results

Pooled percentile estimates with standard errors and range across implicates. One row per group, or one row total.

imps

A named list of implicate-level estimates.

aux

Variable, group, and quantile metadata.

Arguments

scf

An scf_mi_survey object created with scf_load(). Must contain five implicates.

var

A one-sided formula identifying the continuous variable to summarize (e.g., ~networth).

q

A quantile to estimate (between 0 and 1). Defaults to 0.5 (median).

by

Optional one-sided formula specifying a discrete grouping variable for stratified percentiles.

verbose

Logical. If TRUE, include implicate-level results in print output. Default is FALSE.

Details

The percentile is a value below which a given percentage of observations fall. This function estimates the desired percentile score within each implicate of the SCF’s multiply-imputed dataset, and then averages them to generate a population estimate.

When a grouping variable is supplied, the percentile is estimated separately within each group in each implicate. Group-level results are then pooled across implicates.
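The two-step grouped procedure described above can be sketched in base R. This is only an illustration under stated assumptions: the five data frames are simulated stand-ins for implicates, the column names `group` and `value` are hypothetical, and the quantile is unweighted for brevity, whereas a real SCF estimate needs the survey weights. It is not the package's internal code.

```r
# Simulated stand-ins for five implicates; names and values are hypothetical.
set.seed(1)
implicates <- lapply(1:5, function(i) {
  data.frame(
    group = rep(c("a", "b"), each = 50),
    value = c(rlnorm(50, meanlog = 10), rlnorm(50, meanlog = 11))
  )
})

# Step 1: estimate the 90th percentile within each group, one implicate at a time.
# Unweighted quantile() is an assumption made to keep the sketch short.
by_imp <- sapply(implicates, function(d) {
  tapply(d$value, d$group, quantile, probs = 0.9, names = FALSE)
})
# by_imp is a 2 x 5 matrix: rows are groups, columns are implicates.

# Step 2: pool each group's estimate by averaging across the implicates.
pooled <- rowMeans(by_imp)
pooled
```

Keeping the implicate-level matrix around (here `by_imp`) mirrors the idea of returning both pooled and implicate-level estimates, so the spread across implicates stays inspectable.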

Unlike scf_mean(), this function does not pool results using Rubin’s Rules. Instead, it follows the Federal Reserve’s practice for reporting percentiles in official SCF publications: compute the desired percentile separately within each implicate, then average the resulting values to obtain a pooled estimate.

Standard errors are approximated using the sample standard deviation of the five implicate-level estimates. This method is consistent with the SCF's official percentile macro (see Kennickell 1998, and the Federal Reserve Board's official SAS script for the 2022 SCF).
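As a minimal sketch of the pooling arithmetic described in this section, the snippet below takes five implicate-level estimates and derives the pooled estimate, the approximate standard error, and the range across implicates. The numbers are made up for illustration and are not SCF results; the variable names are hypothetical.

```r
# Hypothetical implicate-level estimates of one percentile (made-up numbers).
imp_est <- c(118000, 121500, 119200, 122300, 120000)

pooled_estimate <- mean(imp_est)   # average of the five implicate estimates
approx_se       <- sd(imp_est)     # sample SD (n - 1 denominator) across implicates
imp_range       <- range(imp_est)  # smallest and largest implicate estimate

c(estimate = pooled_estimate, se = approx_se,
  min = imp_range[1], max = imp_range[2])
```

Because the standard error here reflects only the variation between implicates, it should be read as an approximation rather than a full Rubin's Rules variance.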

References

Kennickell AB, McManus DA, Woodburn RL. Weighting design for the 1992 Survey of Consumer Finances. U.S. Federal Reserve. https://www.federalreserve.gov/Pubs/OSS/oss2/papers/weight92.pdf

U.S. Federal Reserve. Codebook for 2022 Survey of Consumer Finances. https://www.federalreserve.gov/econres/scfindex.htm

See Also

scf_median()

Examples

# Do not implement these lines in real analysis:
# Use functions `scf_download()` and `scf_load()`
td  <- tempdir()
src <- system.file("extdata", "scf2022_mock_raw.rds", package = "scf")
file.copy(src, file.path(td, "scf2022.rds"), overwrite = TRUE)
scf2022 <- scf_load(2022, data_directory = td)

# Example for real analysis: Estimate percentiles
scf_percentile(scf2022, ~networth, q = 0.5)
scf_percentile(scf2022, ~networth, q = 0.9, by = ~edcl)

# Do not implement these lines in real analysis: Cleanup for package check
unlink(file.path(td, "scf2022.rds"), force = TRUE)