
SeBR (version 1.1.0)

blm_bc_hs: Bayesian linear model with a Box-Cox transformation and a horseshoe prior

Description

MCMC sampling for Bayesian linear regression with 1) a (known or unknown) Box-Cox transformation and 2) a horseshoe prior for the (possibly high-dimensional) regression coefficients.

Usage

blm_bc_hs(
  y,
  X,
  X_test = X,
  lambda = NULL,
  sample_lambda = TRUE,
  only_theta = FALSE,
  nsave = 1000,
  nburn = 1000,
  nskip = 0,
  verbose = TRUE
)

Value

a list with the following elements:

  • coefficients: the posterior mean of the regression coefficients

  • fitted.values: the posterior predictive mean at the test points X_test

  • post_theta: nsave x p samples from the posterior distribution of the regression coefficients

  • post_ypred: nsave x n_test samples from the posterior predictive distribution at test points X_test

  • post_g: nsave posterior samples of the transformation evaluated at the unique y values

  • post_lambda: nsave posterior samples of lambda

  • post_sigma: nsave posterior samples of sigma

  • model: the model fit (here, blm_bc_hs)

as well as the arguments passed in.
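
For illustration, the posterior draws returned above can be summarized directly. A minimal sketch, assuming a fitted object named fit returned by blm_bc_hs():

# 95% posterior credible intervals for the regression coefficients (post_theta is nsave x p)
ci_theta = t(apply(fit$post_theta, 2, quantile, probs = c(0.025, 0.975)))

# 95% posterior predictive intervals at the test points (post_ypred is nsave x n_test)
pi_ypred = t(apply(fit$post_ypred, 2, quantile, probs = c(0.025, 0.975)))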

Arguments

y

n x 1 vector of observed data

X

n x p matrix of predictors (no intercept)

X_test

n_test x p matrix of predictors for test data; default is the observed covariates X

lambda

Box-Cox transformation; if NULL, estimate this parameter

sample_lambda

logical; if TRUE, sample lambda; otherwise use the fixed value of lambda supplied above or, if lambda is unspecified, its MLE

only_theta

logical; if TRUE, only return posterior draws of the regression coefficients (for speed)

nsave

number of MCMC iterations to save

nburn

number of MCMC iterations to discard

nskip

number of MCMC iterations to skip between saving iterations, i.e., save every (nskip + 1)th draw

verbose

logical; if TRUE, print time remaining
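
As a quick check on the MCMC bookkeeping implied by these arguments (a sketch based only on the descriptions above; the package may count iterations slightly differently):

nsave = 1000; nburn = 1000; nskip = 2
nburn + nsave * (nskip + 1) # 4000 total iterations: 1000 burn-in, then keep every 3rd of 3000 draws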

Details

This function provides fully Bayesian inference for a transformed linear model via MCMC sampling. The transformation is parametric and belongs to the Box-Cox family, which is indexed by a single parameter lambda. That parameter may be fixed in advance or learned from the data.
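
For reference, the Box-Cox family takes the form g(y) = (y^lambda - 1)/lambda for lambda != 0 and g(y) = log(y) for lambda = 0. A minimal sketch (box_cox is an illustrative name, not necessarily the internal SeBR implementation):

box_cox = function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}
box_cox(c(1, 2, 5), lambda = 0.5) # square-root-type transformation when lambda = 0.5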

The horseshoe prior is especially useful for high-dimensional settings with many (possibly correlated) covariates. This function uses a fast Cholesky-forward/backward sampler when p < n and the Bhattacharya et al. (<https://doi.org/10.1093/biomet/asw042>) sampler when p > n. Thus, the sampler scales linearly in n (for fixed/small p) or linearly in p (for fixed/small n).
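
To convey the idea behind the p > n case, here is a minimal sketch of the Bhattacharya et al. fast Gaussian sampler for drawing theta ~ N(A^{-1} Phi' alpha, A^{-1}) with A = Phi'Phi + D^{-1}. This is illustrative only; fast_normal_draw, Phi, alpha, and d are assumed names, not SeBR internals.

# Phi: n x p design, alpha: n x 1 working response, d: p x 1 prior variances (diagonal of D)
fast_normal_draw = function(Phi, alpha, d) {
  n = nrow(Phi); p = ncol(Phi)
  u = sqrt(d) * rnorm(p)                 # u ~ N(0, D)
  v = Phi %*% u + rnorm(n)               # v ~ N(Phi u, I_n)
  M = Phi %*% (d * t(Phi)) + diag(n)     # Phi D Phi' + I_n (n x n; cheap when n << p)
  w = solve(M, alpha - v)
  as.numeric(u + d * crossprod(Phi, w))  # one draw from N(A^{-1} Phi' alpha, A^{-1})
}

The cost is dominated by the n x n solve, which is why each draw scales (roughly) linearly in p for fixed/small n.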

Examples

library(SeBR)

# Simulate data from a transformed (sparse) linear model:
dat = simulate_tlm(n = 100, p = 50, g_type = 'step', prop_sig = 0.1)
y = dat$y; X = dat$X # training data

hist(y, breaks = 25) # marginal distribution

# Fit the Bayesian linear model with a Box-Cox transformation & a horseshoe prior:
fit = blm_bc_hs(y = y, X = X, verbose = FALSE)
names(fit) # what is returned
