mo.bqr.svy: Multiple-Output Bayesian quantile regression for complex survey data

Description

mo.bqr.svy implements a Bayesian approach to multiple-output quantile regression for complex survey data analysis. The method builds a quantile region based on a directional approach. To improve computational efficiency, an Expectation-Maximization (EM) algorithm is implemented instead of the usual Markov Chain Monte Carlo (MCMC).

Usage

mo.bqr.svy(
  formula,
  weights = NULL,
  data = NULL,
  quantile = 0.5,
  prior = NULL,
  U = NULL,
  gamma_U = NULL,
  n_dir = NULL,
  epsilon = 1e-06,
  max_iter = 1000,
  verbose = FALSE,
  estimate_sigma = FALSE
)

Value

An object of class "mo.bqr.svy" containing:

call: The matched call
formula: The model formula
terms: The terms object
quantile: Vector of fitted quantiles
prior: List of priors used for each quantile
fit: List of fitted results for each quantile, each containing one sub-list per direction
coefficients: Coefficients organized by quantile
sigma: List of scale parameters by quantile and direction. If estimate_sigma = FALSE, all entries are fixed at 1. If estimate_sigma = TRUE, each entry contains the estimated value of \(\sigma\) (posterior mode from EM).
n_dir: Number of directions
U: Matrix of projection directions (\(d \times K\))
Gamma_list: List of orthogonal complement bases, one per direction
n_obs: Number of observations
n_vars: Number of covariates
response_dim: Dimension of the response \(d\)
estimate_sigma: Logical flag indicating whether the scale parameter \(\sigma^2\) was estimated (TRUE) or fixed at 1 (FALSE).

Arguments

formula: a symbolic description of the model to be fit.
weights: an optional numerical vector containing the survey weights. If NULL, equal weights are used.
data: an optional data frame containing the variables in the model.
quantile: numerical scalar or vector containing quantile(s) of interest (default=0.5).
prior: a bqr_prior object of class "prior". If omitted, a vague prior is assumed (see prior).
U: an optional \(d \times K\)-matrix of directions, where \(d\) indicates the response variable dimension and \(K\) indicates indicates the number of directions.
gamma_U: an optional list with length equal to \(K\) for which each element corresponds to \(d \times (d-1)\)-matrix of ortoghonal basis for each row of U.
n_dir: numerical scalar corresponding to the number of directions (if U and gamma_U are not supplied).
epsilon: numerical scalar indicating the convergence tolerance for the EM algorithm (default = 1e-6).
max_iter: numerical scalar indicating maximum number of EM iterations (default = 1000).
verbose: logical flag indicating whether to print progress messages (default=FALSE).
estimate_sigma: logical flag indicating whether to estimate the scale parameter when method = "ald" (default=FALSE and \(\sigma^2\) is set to 1)

References

Nascimento, M. L. & Gonçalves, K. C. M. (2024). Bayesian Quantile Regression Models for Complex Survey Data Under Informative Sampling. Journal of Survey Statistics and Methodology, 12(4), 1105–1130. doi:10.1093/jssam/smae015

Examples

Run this code

# \donttest{
library(MASS)

# Generate population data
set.seed(123)
N    <- 10000
data <- mvrnorm(N, rep(0, 3),
                matrix(c(4, 0, 2,
                         0, 1, 1.5,
                         2, 1.5, 9), 3, 3))
x_p  <- as.matrix(data[, 1])
y_p  <- data[, 2:3] + cbind(rep(0, N), x_p)

# Generate sample data
n <- 500
z_aux <- rnorm(N, mean = 1 + y_p, sd = 0.5)
p_aux <- 1 / (1 + exp(2.5 - 0.5 * z_aux))
s_ind <- sample(1:N, n, replace = FALSE, prob = p_aux)
y_s   <- y_p[s_ind, ]
x_s   <- x_p[s_ind, ]
w     <- 1 / p_aux[s_ind]
data_s <- data.frame(y1 = y_s[, 1],
                     y2 = y_s[, 2],
                     x1 = x_s,
                     w  = w)

# Basic usage with default priors when U and gamma_U are given
fit1 <- mo.bqr.svy(
  cbind(y1, y2) ~ x1,
  weights = w,
  data = data_s,
  quantile = c(0.1, 0.2),
  U = matrix(c(0, 1, 1/sqrt(2), 1/sqrt(2)), 2),
  gamma_U = list(c(1, 0), c(1/sqrt(2), -1/sqrt(2)))
)

# Basic usage with default priors when n_dir is given
fit2 <- mo.bqr.svy(
  cbind(y1, y2) ~ x1,
  weights = w,
  data = data_s,
  quantile = c(0.1, 0.2),
  n_dir = 2
)
# }

Run the code above in your browser using DataLab