
SignalY (version 1.1.1)

fit_horseshoe: Fit Regularized Horseshoe Regression Model

Description

Fits a Bayesian linear regression with a regularized Horseshoe prior, using Stan via cmdstanr. This version includes improved numerical stability and automatic prior calibration.

Usage

fit_horseshoe(
  y,
  X,
  var_names = NULL,
  p0 = NULL,
  slab_scale = 3,
  slab_df = 4,
  tau_scale = NULL,
  use_qr = FALSE,
  standardize = TRUE,
  X_new = NULL,
  iter_warmup = 1000,
  iter_sampling = 1000,
  chains = 4,
  adapt_delta = 0.95,
  max_treedepth = 12,
  seed = 123,
  verbose = TRUE
)

Value

A list of class "signaly_horseshoe" with posterior summaries, diagnostics, and model fit object.

Arguments

y

Numeric vector of the response variable.

X

Matrix or data frame of predictor variables.

var_names

Optional character vector of variable names.

p0

Expected number of non-zero coefficients. Default: NULL, in which case P/3 is used, where P is the number of predictors (columns of X).

slab_scale

Scale for the regularizing slab. Default: 3.

slab_df

Degrees of freedom for the slab. Default: 4.

tau_scale

Scale multiplier for the global shrinkage prior. Default: NULL (auto-calibrated based on data characteristics). Increase this value (e.g., 10-20) if the model over-shrinks.

use_qr

Use QR decomposition? Default: FALSE.

standardize

Standardize predictors internally? Default: TRUE.

X_new

Optional matrix for out-of-sample prediction.

iter_warmup

Warmup iterations per chain. Default: 1000.

iter_sampling

Sampling iterations per chain. Default: 1000.

chains

Number of MCMC chains. Default: 4.

adapt_delta

Target acceptance probability. Default: 0.95.

max_treedepth

Maximum tree depth. Default: 12.

seed

Random seed. Default: 123.

verbose

Print progress messages? Default: TRUE.
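The arguments above can be put together as in the following minimal sketch. The simulated data, the choice p0 = 3, and the reduced iteration counts are illustrative assumptions, not recommendations from the package. The call is guarded because it needs SignalY, cmdstanr, and a working CmdStan installation. The element names of the returned list are not documented here, so the sketch only prints them.

```r
# Simulate a sparse regression problem: 3 of 10 predictors are truly active.
set.seed(123)
n <- 100
P <- 10
X <- matrix(rnorm(n * P), nrow = n, ncol = P)
colnames(X) <- paste0("x", seq_len(P))
beta_true <- c(2, -1.5, 1, rep(0, P - 3))
y <- as.numeric(X %*% beta_true + rnorm(n))

# Fit only when SignalY is installed; any failure (for example a missing
# CmdStan installation) is reported as a message instead of stopping the script.
fit <- NULL
if (requireNamespace("SignalY", quietly = TRUE)) {
  fit <- tryCatch(
    SignalY::fit_horseshoe(
      y = y,
      X = X,
      var_names = colnames(X),
      p0 = 3,               # assumed prior guess: three non-zero coefficients
      iter_warmup = 500,    # shortened for a quick illustration
      iter_sampling = 500,
      chains = 2,
      seed = 123,
      verbose = FALSE
    ),
    error = function(e) {
      message("fit_horseshoe() did not run: ", conditionMessage(e))
      NULL
    }
  )
}

# Inspect the returned object without assuming its internal structure.
if (!is.null(fit)) {
  print(class(fit))
  print(names(fit))
}
```

All other arguments are left at the defaults shown under Usage.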

Details

The regularized Horseshoe prior (Piironen & Vehtari, 2017) provides adaptive shrinkage that can distinguish between relevant and irrelevant predictors.
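For reference, the prior hierarchy as given in Piironen & Vehtari (2017) can be written as below, with P predictors, n observations, and residual standard deviation sigma. How the arguments map onto it is an assumption here: slab_scale is taken to be s, slab_df to be nu, and p0 to be p_0. The exact role of tau_scale and of the automatic calibration in this package is not specified on this page.

```latex
\begin{aligned}
\beta_j \mid \lambda_j, \tau, c &\sim \mathcal{N}\!\left(0,\ \tau^2 \tilde{\lambda}_j^2\right), \qquad
\tilde{\lambda}_j^2 = \frac{c^2 \lambda_j^2}{c^2 + \tau^2 \lambda_j^2}, \\
\lambda_j &\sim \mathrm{C}^{+}(0, 1), \qquad
\tau \sim \mathrm{C}^{+}(0, \tau_0), \\
c^2 &\sim \text{Inv-Gamma}\!\left(\tfrac{\nu}{2},\ \tfrac{\nu s^2}{2}\right), \qquad
\tau_0 = \frac{p_0}{P - p_0}\,\frac{\sigma}{\sqrt{n}}.
\end{aligned}
```

When tau^2 lambda_j^2 is much smaller than c^2, the prior behaves like the original Horseshoe. When it is much larger, the coefficient is regularized by a Gaussian slab with variance c^2.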

Variable Selection Methods:

After fitting, variables can be selected using different criteria:

  • select_by_credible_interval: Selects variables whose credible interval excludes zero. Recommended as the most robust method.

  • select_by_shrinkage: Selects based on kappa (shrinkage factor). May underselect when tau is very small.

  • select_by_magnitude: Selects based on coefficient magnitude.
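The credible-interval criterion can be illustrated in base R as follows. This is a sketch of the general idea only: select_by_ci_sketch is a hypothetical helper, not the package's select_by_credible_interval, and the posterior draws are simulated rather than taken from a fitted model.

```r
# Hypothetical helper: flag coefficients whose central credible interval
# excludes zero. `draws` is a (posterior draws x coefficients) matrix.
select_by_ci_sketch <- function(draws, level = 0.95) {
  alpha <- 1 - level
  lower <- apply(draws, 2, quantile, probs = alpha / 2)
  upper <- apply(draws, 2, quantile, probs = 1 - alpha / 2)
  data.frame(
    variable = colnames(draws),
    lower = unname(lower),
    upper = unname(upper),
    selected = unname(lower > 0 | upper < 0)
  )
}

# Simulated draws standing in for posterior samples of three coefficients:
# x1 clearly positive, x2 centred on zero, x3 clearly negative.
set.seed(1)
draws <- cbind(
  x1 = rnorm(4000, mean = 2, sd = 0.2),
  x2 = rnorm(4000, mean = 0, sd = 0.1),
  x3 = rnorm(4000, mean = -1, sd = 0.3)
)

sel <- select_by_ci_sketch(draws)
print(sel)
```

Because this criterion uses only the posterior of each coefficient, it does not depend on the scale of tau.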

Note on kappa-based selection:

The shrinkage factor kappa depends on the global shrinkage parameter tau. In some datasets, the posterior of tau may concentrate near zero, causing all kappa values to be close to 1 even for truly relevant variables. When this happens, the coefficient estimates (beta) remain valid, but kappa-based selection will fail. The function automatically warns when this occurs and recommends using select_by_credible_interval() instead.
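The dependence of kappa on tau can be seen in a small numerical sketch. It assumes the shrinkage factor of the original Horseshoe for a standardized predictor, kappa = 1 / (1 + n * tau^2 * lambda^2 / sigma^2). The package may compute kappa differently, for example from the regularized local scales. The values of lambda and tau below are illustrative and held fixed, whereas in a real fit they are inferred jointly.

```r
# Shrinkage factor for a standardized predictor (assumed form, see lead-in).
# kappa near 1 means the coefficient is shrunk almost completely to zero.
kappa_sketch <- function(tau, lambda, n, sigma = 1) {
  1 / (1 + n * tau^2 * lambda^2 / sigma^2)
}

n <- 100
lambda <- c(noise = 0.5, signal = 20)  # illustrative local scales

# With a moderate tau, the two predictors are clearly separated.
kappa_moderate_tau <- kappa_sketch(tau = 0.1, lambda = lambda, n = n)

# With tau concentrated near zero, both kappa values sit close to 1,
# so a kappa threshold no longer separates signal from noise.
kappa_tiny_tau <- kappa_sketch(tau = 1e-4, lambda = lambda, n = n)

print(round(rbind(kappa_moderate_tau, kappa_tiny_tau), 4))
```

In the second row both predictors look fully shrunk on the kappa scale. This is the situation in which the function's warning applies.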