bqr.svy: Bayesian quantile regression for complex survey data

Description

bqr.svy fits Bayesian quantile regression models to complex survey data with a single (univariate) response. To improve computational efficiency, the Markov chain Monte Carlo (MCMC) algorithms are implemented in 'C++'.

Usage

bqr.svy(
  formula,
  weights = NULL,
  data = NULL,
  quantile = 0.5,
  method = c("ald", "score", "approximate"),
  prior = NULL,
  niter = 50000,
  burnin = 0,
  thin = 1,
  verbose = TRUE,
  estimate_sigma = FALSE
)

Value

An object of class "bqr.svy", containing:

beta: Posterior mean estimates of regression coefficients.
draws: Posterior draws from the MCMC sampler.
accept_rate: Average acceptance rate (if available).
warmup, thin: MCMC control parameters used during sampling.
quantile: The quantile(s) fitted.
prior: Prior specification used.
formula, terms, model: Model specification details.
runtime: Elapsed runtime in seconds.
method: Estimation method used.
estimate_sigma: Logical flag indicating whether the scale parameter \(\sigma^2\) was estimated (TRUE) or fixed at 1 (FALSE).
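
As a rough illustration of how posterior draws can be turned into point estimates and credible intervals, the sketch below uses a simulated stand-in matrix rather than a fitted object. The layout assumed here (one row per retained iteration, one column per coefficient) and the names draws, post_mean and summary_tab are assumptions for illustration only; inspect the draws component of an actual fit before relying on this layout.

```r
set.seed(42)

# Stand-in for posterior draws: 1000 iterations x 3 coefficients
# (simulated here; NOT output from bqr.svy)
draws <- cbind(
  "(Intercept)" = rnorm(1000, mean = 2,    sd = 0.1),
  x1            = rnorm(1000, mean = 1.5,  sd = 0.1),
  x2            = rnorm(1000, mean = -0.8, sd = 0.1)
)

# Posterior means, analogous to what the beta component reports
post_mean <- colMeans(draws)

# Equal-tailed 95% credible intervals, one row per coefficient
cred_int <- t(apply(draws, 2, quantile, probs = c(0.025, 0.975)))

summary_tab <- cbind(mean = post_mean, cred_int)
print(round(summary_tab, 3))
```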

Arguments

formula: a symbolic description of the model to be fit.
weights: an optional numerical vector containing the survey weights. If NULL, equal weights are used.
data: an optional data frame containing the variables in the model.
quantile: numerical scalar or vector containing quantile(s) of interest (default=0.5).
method: one of "ald", "score" and "approximate" (default="ald").
prior: a bqr_prior object of class "prior". If omitted, a vague prior is assumed (see prior).
niter: number of MCMC draws (default=50000).
burnin: number of initial MCMC draws to be discarded (default=0).
thin: thinning parameter, i.e., keep every thin-th draw (default=1).
verbose: logical flag indicating whether to print progress messages (default=TRUE).
estimate_sigma: logical flag indicating whether to estimate the scale parameter when method = "ald" (default=FALSE, in which case \(\sigma^2\) is fixed at 1).
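
To see how niter, burnin and thin interact, the sketch below computes which iterations would be retained under the common convention of discarding the first burnin draws and then keeping every thin-th one. Whether bqr.svy indexes retained draws in exactly this way is an assumption; the values of burnin and thin are illustrative rather than the defaults.

```r
niter  <- 50000  # total MCMC draws (the documented default)
burnin <- 10000  # illustrative value, not the default
thin   <- 10     # illustrative value, not the default

# Assumed convention: drop the first `burnin` draws, then keep every
# `thin`-th draw of the remainder
keep <- seq(from = burnin + 1, to = niter, by = thin)

# Number of draws left for posterior summaries
length(keep)
```

Heavier thinning reduces autocorrelation among the stored draws at the cost of a smaller retained sample, so niter usually needs to grow with thin.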

Details

The bqr.svy function can estimate three types of models, where the quantile regression coefficients are defined at the super-population level, and their estimators are built upon the survey weights.

"ald" – The asymmetric Laplace distribution as working likelihood.
"score" – A score-based likelihood function.
"approximate" – A pseudolikelihood function based on a Gaussian approximation.
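
To make the "ald" option more concrete, the sketch below writes out the check loss and the standard asymmetric Laplace log-density in base R, and combines them into a survey-weighted log pseudo-likelihood. The weighting scheme (multiplying each unit's log-density by its weight) and the function names check_loss, ald_logdens and weighted_loglik are assumptions made for illustration; the exact likelihoods used by the package are given in the reference below.

```r
# Check (pinball) loss: rho_tau(u) = u * (tau - I(u < 0))
check_loss <- function(u, tau) u * (tau - (u < 0))

# Log-density of the asymmetric Laplace distribution with location mu,
# scale sigma and skewness parameter tau
ald_logdens <- function(y, mu, sigma, tau) {
  log(tau * (1 - tau) / sigma) - check_loss((y - mu) / sigma, tau)
}

# Hypothetical weighted log pseudo-likelihood: each unit's contribution is
# multiplied by its survey weight (an assumption, not the package's code)
weighted_loglik <- function(beta, y, X, w, tau, sigma = 1) {
  mu <- as.vector(X %*% beta)
  sum(w * ald_logdens(y, mu, sigma, tau))
}

# Toy data: intercept-only model with equal weights
set.seed(1)
y <- rnorm(200)
X <- matrix(1, nrow = 200, ncol = 1)
w <- rep(1, 200)

# With tau = 0.5 the maximizer over the intercept should lie close to the
# sample median
opt <- optimize(function(b) weighted_loglik(b, y, X, w, tau = 0.5),
                interval = range(y), maximum = TRUE)
c(maximizer = opt$maximum, sample_median = median(y))
```

Maximizing this function in beta is equivalent to minimizing the weighted check loss, which is why the ALD serves as a working likelihood for quantile regression even when the data are not ALD-distributed.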

References

Nascimento, M. L. & Gonçalves, K. C. M. (2024). Bayesian Quantile Regression Models for Complex Survey Data Under Informative Sampling. Journal of Survey Statistics and Methodology, 12(4), 1105–1130. doi:10.1093/jssam/smae015

Examples

# \donttest{
# Generate population data
set.seed(123)
N    <- 10000
x1_p <- runif(N, -1, 1)
x2_p <- runif(N, -1, 1)
y_p  <- 2 + 1.5 * x1_p - 0.8 * x2_p + rnorm(N)

# Generate sample data
n <- 500
z_aux <- rnorm(N, mean = 1 + y_p, sd = .5)
p_aux <- 1 / (1 + exp(2.5 - 0.5 * z_aux))
s_ind <- sample(1:N, n, replace = FALSE, prob = p_aux)
y_s   <- y_p[s_ind]
x1_s  <- x1_p[s_ind]
x2_s  <- x2_p[s_ind]
w     <- 1 / p_aux[s_ind]
data  <- data.frame(y = y_s, x1 = x1_s, x2 = x2_s, w = w)

# Basic usage with default method ('ald') and priors (vague)
fit1 <- bqr.svy(y ~ x1 + x2, weights = w, data = data)

# Specify informative priors
# (stored as prior_inf so the object does not mask the prior() function)
prior_inf <- prior(
  beta_x_mean = c(2, 1.5, -0.8),
  beta_x_cov  = diag(c(0.25, 0.25, 0.25)),
  sigma_shape = 1,
  sigma_rate  = 1
)
fit2 <- bqr.svy(y ~ x1 + x2, weights = w, data = data, prior = prior_inf)

# Specify different methods
fit_score  <- bqr.svy(y ~ x1 + x2, weights = w, data = data, method = "score")
fit_approx <- bqr.svy(y ~ x1 + x2, weights = w, data = data, method = "approximate")
# }
