spatialAtomizeR
Bayesian Spatial Regression with Misaligned Data
Overview
spatialAtomizeR implements Bayesian atom-based regression methods (ABRM) for assessing associations between spatially misaligned variables, i.e., variables measured over two distinct and non-nested sets of spatial areas. The ABRM approach does not require any a priori re-alignment of the variables. This package uses NIMBLE under the hood for flexible and efficient Bayesian implementation. The package handles situations where:
- Outcome data and some covariates are measured on one spatial scale (called the "Y-grid"), while the remaining covariates are measured on a different spatial scale (called the "X-grid")
- The areas comprising the two spatial scales are misaligned, i.e., have mismatched boundaries, and neither scale is fully nested within the other
- Variables follow different distributions (normal, Poisson, binomial)
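To make the notion of atoms concrete, the following base-R sketch overlays a 2 x 2 Y-grid and a 3 x 3 X-grid on the unit square and enumerates their non-empty intersections. It is only an illustration under simplifying assumptions (rectangular cells, no spatial packages); `make_grid` and `make_atoms` are hypothetical helpers, not functions from spatialAtomizeR.

```r
# Build an n x n grid of axis-aligned cells on the unit square.
# (make_grid and make_atoms are hypothetical helpers, not package functions.)
make_grid <- function(n) {
  breaks <- seq(0, 1, length.out = n + 1)
  cells <- expand.grid(col = seq_len(n), row = seq_len(n))
  data.frame(
    id   = seq_len(nrow(cells)),
    xmin = breaks[cells$col], xmax = breaks[cells$col + 1],
    ymin = breaks[cells$row], ymax = breaks[cells$row + 1]
  )
}

# Atoms are the positive-area intersections of Y-grid and X-grid cells.
make_atoms <- function(gridy, gridx) {
  pairs <- expand.grid(iy = seq_len(nrow(gridy)), ix = seq_len(nrow(gridx)))
  w <- pmin(gridy$xmax[pairs$iy], gridx$xmax[pairs$ix]) -
    pmax(gridy$xmin[pairs$iy], gridx$xmin[pairs$ix])
  h <- pmin(gridy$ymax[pairs$iy], gridx$ymax[pairs$ix]) -
    pmax(gridy$ymin[pairs$iy], gridx$ymin[pairs$ix])
  keep <- w > 1e-12 & h > 1e-12
  data.frame(y_id = pairs$iy[keep], x_id = pairs$ix[keep], area = (w * h)[keep])
}

gridy <- make_grid(2)  # coarse Y-grid: 4 cells
gridx <- make_grid(3)  # X-grid: 9 cells whose boundaries cut across the Y-grid
atoms <- make_atoms(gridy, gridx)
nrow(atoms)      # 16 atoms
sum(atoms$area)  # the atoms tile the unit square
```

Because the middle X-grid cells straddle the Y-grid boundary at 0.5, neither grid is nested in the other, and each atom belongs to exactly one Y-grid cell and one X-grid cell.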
Installation
You can install the development version from GitHub:
# install.packages("devtools")
devtools::install_github("bellayqian/spatialAtomizeR")
Quick Start
Basic Workflow
library(spatialAtomizeR)
library(nimble) # Required for ABRM models
# 1. Simulate misaligned spatial data with full parameter specification
sim_data <- simulate_misaligned_data(
  seed = 42,
  dist_covariates_x = c('normal', 'poisson', 'binomial'),
  dist_covariates_y = c('normal', 'poisson', 'binomial'),
  dist_y = 'poisson',
  x_intercepts = c(4, -1, -1),    # Intercepts for X covariates
  y_intercepts = c(4, -1, -1),    # Intercepts for Y covariates
  x_correlation = 0.5,            # Spatial correlation for X
  y_correlation = 0.5,            # Spatial correlation for Y
  beta0_y = -1,                   # Outcome intercept
  beta_x = c(-0.03, 0.1, -0.2),   # Coefficients for X covariates
  beta_y = c(0.03, -0.1, 0.2)     # Coefficients for Y covariates
)
# 2. Get NIMBLE model code
model_code <- get_abrm_model()
# 3. Run ABRM analysis
results <- run_abrm(
  gridx = sim_data$gridx,
  gridy = sim_data$gridy,
  atoms = sim_data$atoms,
  model_code = model_code,
  true_params = sim_data$true_params,  # Optional vector of true outcome model coefficients
  norm_idx_x = 1,    # Index of normal-distributed X covariate
  pois_idx_x = 2,    # Index of Poisson-distributed X covariate
  binom_idx_x = 3,   # Index of binomial-distributed X covariate
  norm_idx_y = 1,    # Index of normal-distributed Y covariate
  pois_idx_y = 2,    # Index of Poisson-distributed Y covariate
  binom_idx_y = 3,   # Index of binomial-distributed Y covariate
  dist_y = 2,        # Outcome distribution: 1=normal, 2=poisson, 3=binomial
  niter = 50000,     # MCMC iterations
  nburnin = 30000,   # Burn-in iterations
  nchains = 2        # Number of chains
)
# 4. View results
print(results$parameter_estimates)
Main Features
Data Simulation
- Create two spatial grids ("X-grid" and "Y-grid") with non-nested spatial misalignment
- Generate synthetic spatially correlated variables with customizable distributions over each spatial grid
- Specify true parameter values for validation
Model Fitting
- Atom-based Bayesian regression with NIMBLE
- Support for mixed-type variables (normal, Poisson, binomial)
- Multivariate CAR models to allow for information-sharing over space and across variables
- Automatic convergence diagnostics
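Which convergence diagnostics the package reports is not specified here. As one common example, the Gelman-Rubin potential scale reduction factor compares within-chain and between-chain variance. The sketch below computes it in base R for a single parameter; `rhat` is a hypothetical helper and is not claimed to match the package's internal checks.

```r
# Gelman-Rubin potential scale reduction factor for one parameter.
# rhat is a hypothetical helper; chains is an iterations x chains matrix.
rhat <- function(chains) {
  n <- nrow(chains)
  W <- mean(apply(chains, 2, var))    # mean within-chain variance
  B <- n * var(colMeans(chains))      # between-chain variance
  var_hat <- (n - 1) / n * W + B / n  # pooled variance estimate
  sqrt(var_hat / W)
}

set.seed(3)
good <- cbind(rnorm(1000), rnorm(1000))            # two chains, same target
bad  <- cbind(rnorm(1000), rnorm(1000, mean = 3))  # chains that disagree
rhat(good)  # close to 1
rhat(bad)   # well above 1
```

Values near 1 suggest the chains are sampling the same distribution; larger values indicate that more iterations or a longer burn-in may be needed.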
Method Comparison
- Compare ABRM with dasymetric mapping
- Calculate bias, RMSE, and coverage rates
- Generate comparison plots
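The package's exact metric definitions are not given here. Under the usual definitions, bias, RMSE, and interval coverage for one coefficient could be computed as in the following base-R sketch. The estimates are simulated stand-ins, not package output.

```r
# Illustrative only: simulated estimates of one coefficient across replications.
set.seed(1)
true_beta <- 0.1
n_sims <- 200
est <- rnorm(n_sims, mean = true_beta, sd = 0.05)  # stand-in point estimates
se <- rep(0.05, n_sims)                            # stand-in standard errors
lower <- est - 1.96 * se                           # approximate 95% interval
upper <- est + 1.96 * se

bias <- mean(est - true_beta)
rmse <- sqrt(mean((est - true_beta)^2))
coverage <- mean(lower <= true_beta & true_beta <= upper)

round(c(bias = bias, rmse = rmse, coverage = coverage), 3)
```

RMSE combines bias and variability, so it is never smaller than the absolute bias. Coverage is the share of intervals that contain the true value.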
Sensitivity Analysis
- Test across different correlation structures
- Multiple simulations per setting
- Automated result summarization
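As a rough picture of what summarizing by setting involves, the base-R sketch below averages per-simulation metrics within each correlation value. The table layout and column names are assumptions made for illustration; the package's actual output may be organized differently.

```r
# Hypothetical results table: one row per simulation and parameter.
# Column names are assumptions, not the package's actual output format.
set.seed(2)
sim_results <- expand.grid(
  correlation = c(0.2, 0.6),
  sim = 1:3,
  parameter = c("beta_x1", "beta_y1")
)
sim_results$bias <- rnorm(nrow(sim_results), mean = 0, sd = 0.02)
sim_results$covered <- rbinom(nrow(sim_results), size = 1, prob = 0.95)

# Average bias and coverage within each correlation setting.
summary_by_correlation <- aggregate(
  cbind(bias, covered) ~ correlation,
  data = sim_results,
  FUN = mean
)
summary_by_correlation
```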
S3 Object System
All main functions return S3 objects with dedicated print, summary, and plot methods:
# Create simulated data
sim_data <- simulate_misaligned_data(...)
class(sim_data) # "misaligned_data"
# View results with clean formatting
print(sim_data) # Clean overview
summary(sim_data) # Detailed information
# Run ABRM analysis
results <- run_abrm(...)
class(results) # "abrm"
print(results) # Shows parameter count, bias, coverage
summary(results) # Shows full parameter table
plot(results) # Shows MCMC diagnostic plots
# Compare methods
comparison <- run_both_methods(...)
class(comparison) # "abrm_comparison"
print(comparison) # Shows method comparison summary
summary(comparison) # Shows detailed metrics by method
S3 Methods Examples
The package provides intuitive S3 methods for all major output types:
# Simulated data
sim_data <- simulate_misaligned_data(seed = 123, ...)
print(sim_data)
# Output:
# Simulated Misaligned Spatial Data
# ==================================
# Y-grid cells: 25
# X-grid cells: 100
# Atoms: 200
# ...
# ABRM results
results <- run_abrm(...)
print(results)
# Output:
# ABRM Model Results
# ==================
# Number of parameters estimated: 6
# Mean absolute bias: 0.0234
# Coverage rate: 95.00%
# Use summary() for detailed parameter estimates
summary(results) # Shows full parameter table
Key Functions
| Function | Description |
|---|---|
| simulate_misaligned_data() | Generate simulated spatial data with full parameter control |
| get_abrm_model() | Get NIMBLE model specification |
| run_abrm() | Run ABRM analysis (wrapper function) |
| run_nimble_model() | Run NIMBLE MCMC with diagnostics |
| run_both_methods() | Compare ABRM and dasymetric mapping |
| run_sensitivity_analysis() | Conduct sensitivity analysis |
| prepare_spatial_bookkeeping() | Prepare spatial indices |
| prepare_adjacency_matrices() | Create spatial adjacency structures |
| prepare_nimble_inputs() | Prepare NIMBLE model inputs |
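The adjacency structures mentioned in the table are the kind of input CAR models typically need. As a minimal illustration, assuming a regular lattice and shared-edge (rook) neighbors, an adjacency matrix can be built in base R as follows. `grid_adjacency` is a hypothetical helper, and the package's own function may use a different neighbor definition or format.

```r
# Rook (shared-edge) adjacency for an n_row x n_col lattice, cells numbered row by row.
# grid_adjacency is a hypothetical helper, not a package function.
grid_adjacency <- function(n_row, n_col) {
  ri <- rep(seq_len(n_row), each = n_col)   # row index of each cell
  ci <- rep(seq_len(n_col), times = n_row)  # column index of each cell
  # Two cells are neighbors when their Manhattan distance on the lattice is 1.
  W <- (abs(outer(ri, ri, "-")) + abs(outer(ci, ci, "-"))) == 1
  storage.mode(W) <- "integer"
  W
}

W <- grid_adjacency(5, 5)  # 25 cells
isSymmetric(W)             # adjacency is symmetric
table(rowSums(W))          # corners have 2 neighbors, edges 3, interior cells 4
```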
Data Simulation Parameters
The simulate_misaligned_data() function accepts the following parameters:
Reproducibility Parameters:
- seed: Random seed for reproducibility
Covariate Distributions:
- dist_covariates_x: Vector of distribution types for X-grid covariates (e.g., c('normal', 'poisson', 'binomial'))
- dist_covariates_y: Vector of distribution types for Y-grid covariates
- dist_y: Distribution type for the outcome variable ('normal', 'poisson', or 'binomial')
Data Generation Parameters:
- x_intercepts: Intercepts for X-grid covariates (length must match dist_covariates_x)
- y_intercepts: Intercepts for Y-grid covariates (length must match dist_covariates_y)
- beta0_y: Intercept for the outcome model
- beta_x: True coefficients for X-grid covariates in the outcome model
- beta_y: True coefficients for Y-grid covariates in the outcome model
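The length requirement can be checked before simulating. The sketch below is a hypothetical pre-flight check, not the package's own validation. Requiring one coefficient per covariate is an assumption that seems natural but is not stated explicitly above.

```r
# check_lengths is a hypothetical helper illustrating the length requirement;
# the package may validate its inputs differently.
check_lengths <- function(dist_covariates, intercepts, beta) {
  p <- length(dist_covariates)
  stopifnot(
    "intercepts must have one entry per covariate" = length(intercepts) == p,
    "beta must have one entry per covariate (assumed)" = length(beta) == p
  )
  invisible(TRUE)
}

check_lengths(
  dist_covariates = c('normal', 'poisson', 'binomial'),
  intercepts = c(4, -1, -1),
  beta = c(-0.03, 0.1, -0.2)
)
```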
Between-Variable Correlation:
- x_correlation: Correlation between X-grid covariates (0 to 1)
- y_correlation: Correlation between Y-grid covariates (0 to 1)
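To illustrate what a between-variable correlation of 0.5 means, the sketch below draws three correlated latent variables with MASS::mvrnorm under an exchangeable correlation matrix. This is an assumption about how such a parameter can be interpreted. The package's generator also involves spatial structure and non-normal distributions, which this sketch does not reproduce.

```r
library(MASS)  # listed under Requirements for multivariate normal generation

set.seed(42)
p <- 3
rho <- 0.5  # plays the role of x_correlation in this sketch
Sigma <- matrix(rho, nrow = p, ncol = p)
diag(Sigma) <- 1

# empirical = TRUE forces the sample covariance to equal Sigma exactly,
# which makes the effect of rho easy to see.
latent <- mvrnorm(n = 100, mu = rep(0, p), Sigma = Sigma, empirical = TRUE)
round(cor(latent), 2)
```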
Distribution Type Indices
When running ABRM models, you need to specify which covariates follow which distributions:
- norm_idx_x, norm_idx_y: Indices of normally distributed covariates
- pois_idx_x, pois_idx_y: Indices of Poisson-distributed covariates
- binom_idx_x, binom_idx_y: Indices of binomially distributed covariates
- dist_y: Outcome distribution type (1=normal, 2=poisson, 3=binomial)
Example: If dist_covariates_x = c('normal', 'poisson', 'binomial'), then:
- norm_idx_x = 1 (first covariate)
- pois_idx_x = 2 (second covariate)
- binom_idx_x = 3 (third covariate)
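Rather than typing the indices by hand, they can be derived from the distribution vector with base R. This is a convenience sketch. Whether run_abrm() accepts index vectors longer than one (several covariates of the same type) is not stated here, so treat that case as untested.

```r
dist_covariates_x <- c('normal', 'poisson', 'binomial')

# Position of each distribution type within the covariate vector.
norm_idx_x  <- which(dist_covariates_x == 'normal')
pois_idx_x  <- which(dist_covariates_x == 'poisson')
binom_idx_x <- which(dist_covariates_x == 'binomial')

# Outcome code follows the stated mapping: 1=normal, 2=poisson, 3=binomial.
dist_y <- match('poisson', c('normal', 'poisson', 'binomial'))

c(norm_idx_x, pois_idx_x, binom_idx_x, dist_y)  # 1 2 3 2
```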
Example: Comprehensive Sensitivity Analysis
library(spatialAtomizeR)
library(nimble)
# Define base parameters
base_params <- list(
  dist_covariates_x = c('normal', 'poisson', 'binomial'),
  dist_covariates_y = c('normal', 'poisson', 'binomial'),
  dist_y = 'poisson',
  x_intercepts = c(4, -1, -1),
  y_intercepts = c(4, -1, -1),
  beta0_y = -1,
  beta_x = c(-0.03, 0.1, -0.2),
  beta_y = c(0.03, -0.1, 0.2)
)
# Get model code
model_code <- get_abrm_model()
# Run sensitivity analysis across correlation structures
sensitivity_results <- run_sensitivity_analysis(
  correlation_grid = c(0.2, 0.6),
  n_sims_per_setting = 3,
  base_params = base_params,
  model_code = model_code,
  base_seed = 123
)
# View summary by correlation
print(sensitivity_results$summary_by_correlation)
# Save detailed results to a CSV file
write.csv(
  sensitivity_results$combined_results,
  "sensitivity_analysis_full_results.csv"
)
Requirements
- R >= 4.0.0
- nimble for MCMC sampling (must be loaded)
- Spatial packages: sp, sf, spdep, raster
- BiasedUrn for multivariate hypergeometric sampling
- MASS for multivariate normal generation
- dplyr, tidyr for data manipulation
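A quick way to check these requirements before installing is sketched below. It uses only base R, and the package list is copied from the Requirements list above.

```r
# Packages named in the Requirements list above.
required <- c("nimble", "sp", "sf", "spdep", "raster",
              "BiasedUrn", "MASS", "dplyr", "tidyr")

# requireNamespace() returns FALSE (without an error) for missing packages.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!installed]

if (getRversion() < "4.0.0") warning("R >= 4.0.0 is required")
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
}
```

Any packages reported as missing can then be installed with install.packages(missing_pkgs).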
Funding and Project Information
This work was funded by the Robert Wood Johnson Foundation, Grant 81746. Project details are provided below.
Project Title: Aligning spatially misaligned data for health equity analysis, action, and accountability
Principal Investigators: Dr. Nancy Krieger (PI) and Dr. Rachel Nethery (co-PI)
Start Date: July 2024
Project Team and Collaborators:
- Yunzhe Qian (Bella), MS (Research Assistant, Dept of Biostatistics, HSPH)
- Rachel Nethery, PhD (Associate Professor, Dept of Biostatistics, HSPH)
- Nancy Krieger, PhD (Professor, Department of Social and Behavioral Sciences (SBS), HSPH)
- Nykesha Johnson, MPH (Statistical Data Analyst/Data Manager, SBS, HSPH)
Citation
If you use this package, please cite:
Qian Y, Nethery R, Krieger N, Johnson N (2025). spatialAtomizeR: Spatial Analysis with Misaligned Data Using Atom-Based Regression Models. R package version 0.2.4, https://github.com/bellayqian/spatialAtomizeR.
About
This work is an extension of:
Nethery, R. C., Testa, C., Tabb, L. P., Hanage, W. P., Chen, J. T., & Krieger, N. (2023). Addressing spatial misalignment in population health research: a case study of US congressional district political metrics and county health data. MedRxiv.
Spatial misalignment—which occurs when data on multiple variables are collected using mismatched geographic boundary definitions—is a longstanding challenge in public health research. For instance, congressional districts can cut across multiple counties, and environmental hazard zones may cross census tract boundaries, in both cases creating intersecting areas that complicate efforts to study the relationships between health outcomes and their social, political, and environmental determinants.
Atom-based regression models (ABRM) offer a promising alternative by using atoms—the intersecting areas of all relevant units—as the fundamental units of analysis. By preserving the original spatial resolution of the data, ABRM account for uncertainty in statistical relationships while offering a robust method for handling misaligned data.
Getting Help
- Report bugs: https://github.com/bellayqian/spatialAtomizeR/issues
- Documentation: ?spatialAtomizeR
- Vignette: vignette("getting-started", package = "spatialAtomizeR")
- Function help: ?simulate_misaligned_data, ?run_abrm, etc.
License
MIT License