Learn R Programming

BayesSIM (version 1.0.0)

bsSphere: Bayesian Single-Index Regression with B-Spline Link and Half-Unit Sphere Prior

Description

Fits a single-index model \(Y_i \sim \mathcal{N}(f(X_i'\theta), \sigma^2), i = 1,\cdots,n\) where the link \(f(\cdot)\) is represented by B-spline link and the index vector \(\theta\) is on half-unit sphere.

Usage

bsSphere(
  formula,
  data,
  prior = NULL,
  init = NULL,
  monitors = NULL,
  niter = 10000,
  nburnin = 1000,
  thin = 1,
  nchain = 1,
  setSeed = FALSE
)

bsSphere_setup( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE )

Value

A list typically containing:

coefficients

Mean estimates of index vector. Return list of model_setup does not include it.

ses

Mean standard error of index vector. Return list of model_setup does not include it.

residuals

Residuals with mean estimates of fitted values. Return list of model_setup does not include it.

fitted.values

Mean estimates of fitted values. Return list of model_setup does not include it.

linear.predictors

Mean estimates of single-index values. Return list of model_setup does not include it.

gof

Goodness-of-fit. Return list of model_setup function does not include it.

samples

Posterior draws of variables for computing fitted values of the model, including \(\theta\), \(\sigma^2\). Return list of model_setup does not include it.

input

List of data used in modeling, formula, input values for prior and initial values, and computation time without compiling.

defModel

Nimble model object.

defSampler

Nimble sampler object.

modelName

Name of the model.

Arguments

formula

an object of class formula. Interaction term is not allowed for single-index model.

data

an data frame.

prior

Optional named list of prior settings. Further descriptions are in Details section.

init

Optional named list of initial values. If the values are not assigned, they are randomly sampled from prior or designated value. Further descriptions are in Details section.

monitors

Optional character vector of additional monitor nodes. To check the names of the nodes, fit the model_setup function and then inspect the variable names stored in the model object using getVarMonitor.

niter

Integer. Total MCMC iterations (default 10000).

nburnin

Integer. Burn-in iterations (default 1000).

thin

Integer. Thinning for monitors (default 1).

nchain

Integer. Number of MCMC chains (default 1).

setSeed

Logical or numeric argument. Further details are provided in runMCMC setSeed argument.

Details

Model The single–index model uses a \(m\)-order polynomial spline with \(k\) interior knots and intercept as follows: $$f(t) = \sum_{j=1}^{1+m+k} B_j(t)\,\beta_j$$ on \([a, b]\) with \(t_i = X_i' \theta, i = 1,\cdots, n\) and \(\|\theta\|_2 = 1\). \(\{\beta_j\}_{j=1}^{m+k+1}\) are spline coefficients and \(a_\theta\), \( b_\theta\) are boundary knots where \(a_{\theta} = min(t_i, i = 1, \cdots, n)\), and \(b_{\theta} = max(t_i, i = 1,\cdots, n)\). Variable selection is encoded by a binary vector \(\nu\), equivalently setting components of \(\theta\) to zero when \(\nu_i = 0\).

Priors

  • The variable selection indicator \(\nu\) has Beta–Bernoulli hierarchy prior. Beta hyper-prior on the Bernoulli–inclusion probability \(w\), inducing \(p(\nu) \propto \mathrm{Beta}(r_1 + n_\nu, r_2 + p - n_\nu)\) where \(n_\nu = \Sigma_{i=1}^{p}I(\nu_i = 1)\). \(r_1, r_2\) are shape and rate parameter of beta distribution.

  • Free‑knot prior: the number of knots \(k\) with mean \(\lambda_k\). The knot locations \(\xi_i, i = 1,...,k\) have a Dirichlet prior on the scaled interval \([0, 1]\).

  • Index vector prior is uniform on the half‑sphere of dimension \(n_\nu\) with first nonzero positive.

  • Conjugate normal–inverse-gamma on \((\beta, \sigma^2)\) enables analytic integration for RJMCMC with covariance \(\tau \Sigma_0\).

Sampling Posterior exploration follows a hybrid RJMCMC with six move types: add/remove predictor \(\nu\), update \(\theta\), add/delete/relocate a knot. The \(\theta\) update is a random‑walk Metropolis via local rotations in a two‑coordinate subspace. Knot changes use simple proposals with tractable acceptance ratios. Further sampling method is described in Wang (2009).

Prior hyper-parameters These are the prior hyper-parameters set in the function. You can define new values for each parameter in prior_param.

  1. Index vector: index_nu_r1, index_nu_r2 gives the shape and rate parameter of beta-binomial prior, respectively. (default index_nu_r1 = 1, index_nu_r2 = 1).

  2. Link function: B-spline knots, basis and coefficient setup.

    • knots: Free-knot prior for the spline. link_knots_lambda_k is the Poisson mean for the number of interior knot \(k\) (default 5). link_knots_maxknots is the maximum number of interior knots. If link_knots_maxknots is NULL, the number of interior knots is randomly drawn from a Poisson distribution.

    • basis: For the basis of B-spline, link_basis_degree is the spline degree (default 2).

    • beta: For the coefficient of B-spline, By default, link_beta_mu is a zero vector, link_beta_tau is set to the sample size, and link_beta_Sigma0 is the identity matrix of dimension \(1 + k + m\), where \(k\) is the number of interior knots and \(m\) is the spline order.

  3. Error variance (sigma2): Inverse gamma prior is assigned to \(\sigma^2\) where sigma2_shape is shape parameter and sigma2_rate is rate parameter of inverse gamma distribution. Small values for shape and rate parameters yield a weakly-informative prior on \(\sigma^{2}\). (default sigma2_shape = 0.001, sigma2_rate = 0.001)

Initial values These are the initial values set in the function. You can define new values for each initial value in init_param.

  1. Index vector: index_nu is binary vector indicating active predictors for the index. index is initial unit-norm index vector \(\theta\) (automatically normalized if necessary, with the first nonzero element made positive for identifiability). The vector length must match the number of predictor. Ideally, positions where index_nu has a value of 1 should correspond to nonzero elements in \(\theta\); elements corresponding to index_nu = 0 will be set to zero.

  2. Link function: link_k is initial number of interior knots. link_knots is initial vector of interior knot positions in \([0, 1]\), automatically scaled to the true boundary. Length of this vector should be equal to the initial value of k. link_beta is initial vector of spline coefficients. Length should be equal to the initial number of basis functions with intercept (\(1 + k + m\)).

  3. Error variance (sigma2): Initial scalar error variance. (default 0.01)

References

Wang, H.-B. (2009). Bayesian estimation and variable selection for single index models. Computational Statistics & Data Analysis, 53, 2617–2627.

Hornik, K., & Grün, B. (2014). movMF: an R package for fitting mixtures of von Mises-Fisher distributions. Journal of Statistical Software, 58, 1-31.

Examples

Run this code
# \donttest{
set.seed(123)
n <- 200; d <- 4
theta <- c(2, 1, 1, 1); theta <- theta / sqrt(sum(theta^2))
f <- function(u) u^2 * exp(u)
sigma <- 0.5
X <- matrix(runif(n * d, -1, 1), nrow = n)
index_vals <- as.vector(X %*% theta)
y <- f(index_vals) + rnorm(n, 0, sigma)
simdata <- data.frame(x = X, y = y)
colnames(simdata) <- c(paste0("X", 1:4), "y")

# One tool version
fit1 <- bsSphere(y ~ ., data = simdata,
                 niter = 5000, nburnin = 1000, nchain = 1)

# Split version
models <- bsSphere_setup(y ~ ., data = simdata)
Ccompile <- compileModelAndMCMC(models)
nimSampler <- get_sampler(Ccompile)
initList <- getInit(models)
mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1,
                   nchains = 1, setSeed = TRUE, inits = initList,
                   samplesAsCodaMCMC = TRUE)
fit2 <- as_bsim(models, mcmc.out)
summary(fit2)
# }

Run the code above in your browser using DataLab