scf (version 1.0.5)

scf_corr: Estimate Correlation Between Two Continuous Variables in SCF Microdata

Description

This function estimates the linear association between two continuous variables using Pearson's correlation. Estimates are computed within each implicate and then pooled across implicates to account for imputation uncertainty.

Usage

scf_corr(scf, var1, var2)

Value

An object of class scf_corr, containing:

results

Data frame with pooled correlation estimate, standard error, t-statistic, degrees of freedom, p-value, and minimum/maximum values across implicates.

imps

Named vector of implicate-level correlations.

aux

Variable names used in the estimation.

Arguments

scf

An scf_mi_survey object, created by scf_load()

var1

One-sided formula specifying the first variable

var2

One-sided formula specifying the second variable

Implementation

  • Inputs: an scf_mi_survey object and two one-sided formulas (e.g., ~income)

  • Correlation computed using cor(..., use = "complete.obs") within each implicate

  • Rubin’s Rules applied to pool results across implicates
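The three steps above can be sketched in base R. This is a minimal illustration, not the package's source code. The helper `pool_corr()`, the simulated implicates, and the output column names are all hypothetical. The within-implicate sampling variance uses the textbook approximation (1 - r^2) / (n - 2), which is an assumption: this page does not say how `scf_corr()` estimates that quantity.

```r
# Simulated stand-in for a multiply-imputed dataset: five data frames that
# share one variable and differ in the other, as implicates would.
set.seed(1)
n <- 200
base_income <- rlnorm(n, meanlog = 11, sdlog = 1)
implicates <- lapply(1:5, function(i) {
  data.frame(
    income   = base_income,
    networth = 3 * base_income * exp(rnorm(n, sd = 0.8))
  )
})

# Hypothetical helper; NOT part of the scf package.
pool_corr <- function(implicates, var1, var2) {
  x_name <- all.vars(var1)  # one-sided formula ~income -> "income"
  y_name <- all.vars(var2)
  m <- length(implicates)

  # Step 1: Pearson's r within each implicate, complete cases only
  r_i <- vapply(implicates, function(d) {
    cor(d[[x_name]], d[[y_name]], use = "complete.obs")
  }, numeric(1))

  # Number of complete cases per implicate
  n_i <- vapply(implicates, function(d) {
    sum(complete.cases(d[, c(x_name, y_name)]))
  }, integer(1))

  # ASSUMPTION: simple approximation to the sampling variance of r
  u_i <- (1 - r_i^2) / (n_i - 2)

  # Step 2: Rubin's Rules
  q_bar <- mean(r_i)                 # pooled point estimate
  u_bar <- mean(u_i)                 # within-imputation variance
  b     <- var(r_i)                  # between-imputation variance
  t_var <- u_bar + (1 + 1 / m) * b   # total variance
  se    <- sqrt(t_var)
  df    <- (m - 1) * (1 + u_bar / ((1 + 1 / m) * b))^2

  t_stat <- q_bar / se
  data.frame(
    estimate = q_bar, se = se, t = t_stat, df = df,
    p = 2 * pt(-abs(t_stat), df = df),
    min = min(r_i), max = max(r_i)
  )
}

pooled <- pool_corr(implicates, ~income, ~networth)
print(pooled)
```

The pooled estimate is the mean of the implicate-level correlations, so it always lies between the reported minimum and maximum.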

Interpretation

Pearson’s $r$ ranges from -1 to +1 and reflects the strength and direction of the linear association between two continuous variables. Values near 0 indicate a weak linear association, not necessarily the absence of any relationship. Pearson’s $r$ is sensitive to outliers, does not capture nonlinear relationships, and does not adjust for covariates.
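Two of these caveats are easy to demonstrate with base R's `cor()` on toy data. The vectors below are made up for illustration and have nothing to do with SCF variables.

```r
# A perfect but nonlinear (quadratic) relationship: by symmetry the linear
# association is zero even though y is fully determined by x.
x <- -5:5
y <- x^2
r_quadratic <- cor(x, y)

# Two independent variables, then the same data with one extreme point added.
set.seed(42)
a <- rnorm(50)
b <- rnorm(50)
r_clean   <- cor(a, b)
r_outlier <- cor(c(a, 20), c(b, 20))

print(c(quadratic = r_quadratic, clean = r_clean, with_outlier = r_outlier))
```

A single extreme observation is enough to turn a near-zero correlation into a large one, which is why it helps to inspect a scatterplot or hexbin plot alongside the coefficient.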

Statistical Notes

Correlation is computed within each implicate using complete cases. Rubin’s Rules are applied manually to pool estimates and calculate total variance. This function does not use scf_MIcombine(), which is intended for vector-valued estimates; direct pooling is more appropriate for scalar statistics like correlation coefficients.
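For reference, the standard form of Rubin's Rules for a scalar estimate is given below. Here $r_i$ is the correlation in implicate $i$, $U_i$ is its estimated sampling variance, and $m$ is the number of implicates. The degrees-of-freedom expression is Rubin's classic formula. Whether scf_corr() uses exactly this expression, and how it obtains $U_i$, is not stated on this page, so treat those parts as assumptions.

$$
\bar{r} = \frac{1}{m}\sum_{i=1}^{m} r_i, \qquad
\bar{U} = \frac{1}{m}\sum_{i=1}^{m} U_i, \qquad
B = \frac{1}{m-1}\sum_{i=1}^{m} \left(r_i - \bar{r}\right)^2
$$

$$
T = \bar{U} + \left(1 + \frac{1}{m}\right) B, \qquad
\mathrm{SE}(\bar{r}) = \sqrt{T}, \qquad
\nu = (m-1)\left[1 + \frac{\bar{U}}{\left(1 + \frac{1}{m}\right) B}\right]^{2}
$$

The total variance $T$ adds the average within-implicate variance $\bar{U}$ to the between-implicate variance $B$, inflated by $1 + 1/m$ to reflect the finite number of implicates.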

Details

Computes the Pearson correlation coefficient between two continuous variables using multiply-imputed, replicate-weighted SCF data. Returns pooled estimates and standard errors using Rubin’s Rules.

See Also

scf_plot_hex(), scf_ols()

Examples

# Ignore this code block.  It loads mock data for CRAN.
# In your analysis, download and load your data using the
# functions `scf_download()` and `scf_load()`
td <- tempfile("corr_")
dir.create(td)

src <- system.file("extdata", "scf2022_mock_raw.rds", package = "scf")
file.copy(src, file.path(td, "scf2022.rds"), overwrite = TRUE)
scf2022 <- scf_load(2022, data_directory = td)

# EXAMPLE IMPLEMENTATION OF `scf_corr()`:
corr <- scf_corr(scf2022, ~income, ~networth)
print(corr)
summary(corr)

# Ignore the code below.  It is for CRAN:
unlink(td, recursive = TRUE, force = TRUE)
