scf (version 1.0.5)

scf_plot_hist: Histogram of a Continuous Variable in Multiply-Imputed SCF Data

Description

Produces a histogram of a continuous SCF variable by binning the variable within each implicate, pooling the weighted bin counts with scf_freq(), and plotting the result. Values outside xlim are clamped to the nearest endpoint so that all observations are included and replicate-weighted bins remain stable.

Usage

scf_plot_hist(
  design,
  variable,
  bins = 30,
  xlim = NULL,
  title = NULL,
  xlab = NULL,
  ylab = "Weighted Count",
  fill = "#0072B2"
)

Value

A ggplot2 object representing the Rubin-pooled histogram.

Arguments

design

A scf_mi_survey object from scf_load().

variable

A one-sided formula indicating the numeric variable to plot.

bins

Number of bins (default: 30).

xlim

Optional numeric range. Values outside the range are clamped to the nearest endpoint and are therefore counted in the edge bins.

title

Optional plot title.

xlab

Optional x-axis label. Defaults to the variable name.

ylab

Optional y-axis label. Defaults to "Weighted Count".

fill

Fill color for bars (default: "#0072B2").

Implementation

This function clamps the variable to xlim (if supplied), applies the same cut() breaks to every implicate using scf_update_by_implicate(), and computes Rubin-pooled frequencies with scf_freq(). Bins with undefined proportions are dropped, and the remaining bins are plotted with ggplot2::geom_col().
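The steps above can be sketched in base R. This is an illustration only, not the package's internal code: the simulated implicates, the column names age and wgt, and the helper weighted_counts() are all hypothetical, and the pooled value shown is just the mean across implicates (the Rubin point estimate), without the variance computation that scf_freq() would also perform.

```r
# Illustrative sketch in base R; not the scf package's internal code.
set.seed(42)

# Hypothetical stand-in for five implicates of one survey variable
implicates <- lapply(1:5, function(i) {
  data.frame(age = round(runif(200, 18, 95)), wgt = runif(200, 0.5, 2))
})

xlim <- c(20, 80)
bins <- 6
# One set of breaks, shared by every implicate
breaks <- seq(xlim[1], xlim[2], length.out = bins + 1)

# Hypothetical helper: clamp, bin, and sum weights within one implicate
weighted_counts <- function(df) {
  clamped <- pmin(pmax(df$age, xlim[1]), xlim[2])  # clamp to xlim
  bin <- cut(clamped, breaks = breaks, include.lowest = TRUE)
  counts <- tapply(df$wgt, bin, sum)
  counts[is.na(counts)] <- 0  # empty bins come back as NA
  counts
}

# Rows are bins, columns are implicates
per_implicate <- sapply(implicates, weighted_counts)

# Point estimate per bin: mean across implicates
pooled <- rowMeans(per_implicate)
print(round(pooled, 1))
```

Because out-of-range values are clamped rather than dropped, the pooled bin counts sum to the average total weight across implicates.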

Bin assignment must be computed within each implicate rather than after pooling, because imputed values can differ across implicates. Applying identical breaks to every implicate keeps the bins consistent and the pooled estimates stable.
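A toy case in base R shows why the order of operations matters. The values below are hypothetical: one household whose imputed value differs across three implicates and straddles a bin boundary.

```r
# Illustrative sketch in base R; values are hypothetical.
breaks <- c(20, 30, 40)

# One household's imputed value in each of three implicates
imputed <- c(29, 31, 33)

# Bin within each implicate, then pool the bin membership
per_implicate <- cut(imputed, breaks = breaks, include.lowest = TRUE)
pooled_share <- table(per_implicate) / length(imputed)

# Average the values first, then bin once
after_avg <- cut(mean(imputed), breaks = breaks, include.lowest = TRUE)

print(pooled_share)
print(after_avg)
```

Binning within each implicate splits the household's contribution between the two bins (one third and two thirds), while averaging first places it entirely in the upper bin and discards the imputation uncertainty.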

See Also

scf_freq(), scf_plot_dbar(), scf_plot_smooth(), scf_update_by_implicate()

Examples

# Setup for this example only; do not use these lines in real analysis.
# In real analysis, use `scf_download()` and `scf_load()` instead.
td <- tempfile("plot_hist_")
dir.create(td)

src <- system.file("extdata", "scf2022_mock_raw.rds", package = "scf")
file.copy(src, file.path(td, "scf2022.rds"), overwrite = TRUE)
scf2022 <- scf_load(2022, data_directory = td)

# Example for real analysis: Plot histogram of age
scf_plot_hist(scf2022, ~age, bins = 10)

# Do not implement these lines in real analysis: Cleanup for package check
unlink(td, recursive = TRUE, force = TRUE)
