Learn R Programming

scf (version 1.0.5)

scf_subset: Subset an scf_mi_survey Object

Description

Subsetting refers to the process of retaining only those observations that satisfy a logical (TRUE/FALSE) condition. This function applies such a filter independently to each implicate in an scf_mi_survey object created by scf_design() via scf_load(). The result is a new multiply-imputed, replicate-weighted survey object with appropriately restricted designs.

Usage

scf_subset(scf, expr)

Value

A new scf_mi_survey object (see scf_design())

Arguments

scf

A scf_mi_survey object, typically created by scf_load().

expr

A logical expression used to filter rows, evaluated separately in each implicate's variable frame (e.g., age < 65 & own == 1).

Implementation

Use scf_subset() to focus analysis on analytically meaningful sub-populations. For example, to analyze only households headed by seniors:


scf2022_seniors <- scf_subset(scf2022, age >= 65)

This is especially useful when analyzing populations such as renters, homeowners, specific age brackets, or any group defined by logical expressions over SCF variables.

Details

Filtering is conducted separately in each implicate. This preserves valid design structure but means that the same household may fall into or out of the subset depending on imputed values. For example, a household with five different age imputations—say, 64, 66, 63, 65, and 67—would be classified as a senior in only three of five implicates if subsetting on age >= 65.

Empty subsets in any implicate can cause downstream analysis to fail. Always check subgroup sizes after subsetting.

See Also

scf_load(), scf_update()

Examples

Run this code
# Mock workflow for CRAN (demo only — not real SCF data)
td <- tempfile("subset_")
dir.create(td)

src <- system.file("extdata", "scf2022_mock_raw.rds", package = "scf")
file.copy(src, file.path(td, "scf2022.rds"), overwrite = TRUE)
scf2022 <- scf_load(2022, data_directory = td)

# Filter for working-age households with positive net worth
scf_sub <- scf_subset(scf2022, age < 65 & networth > 0)
scf_mean(scf_sub, ~income)

# Do not implement these lines in real analysis: Cleanup for package check
unlink(td, recursive = TRUE, force = TRUE)


Run the code above in your browser using DataLab