Learn R Programming

BayesSIM (version 1.0.0)

bsFisher: Bayesian Single-Index Regression with B-Spline Link and von Mises-Fisher Prior

Description

Fits a single-index model \(Y_i \sim \mathcal{N}(f(X_i'\theta), \sigma^2), i = 1,\cdots,n\) where the link \(f(\cdot)\) is represented by B-spline and the index vector \(\theta\) has von Mises–Fisher prior.

Usage

bsFisher(
  formula,
  data,
  prior = NULL,
  init = NULL,
  monitors = NULL,
  niter = 10000,
  nburnin = 1000,
  thin = 1,
  nchain = 1,
  setSeed = FALSE
)

bsFisher_setup( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE )

Value

A list typically containing:

coefficients

Mean estimates of index vector. Return list of model_setup does not include it.

ses

Mean standard error of index vector. Return list of model_setup does not include it.

residuals

Residuals with mean estimates of fitted values. Return list of model_setup does not include it.

fitted.values

Mean estimates of fitted values. Return list of model_setup does not include it.

linear.predictors

Mean estimates of single-index values. Return list of model_setup does not include it.

gof

Goodness-of-fit. Return list of model_setup function does not include it.

samples

Posterior draws of variables for computing fitted values of the model, including \(\theta\), \(\sigma^2\). Return list of model_setup does not include it.

input

List of data used in modeling, formula, input values for prior and initial values, and computation time without compiling.

defModel

Nimble model object.

defSampler

Nimble sampler object.

modelName

Name of the model.

Arguments

formula

an object of class formula. Interaction term is not allowed for single-index model.

data

an data frame.

prior

Optional named list of prior settings. Further descriptions are in Details section.

init

Optional named list of initial values. If the values are not assigned, they are randomly sampled from prior or designated value. Further descriptions are in Details section.

monitors

Optional character vector of additional monitor nodes. To check the names of the nodes, fit the model_setup function and then inspect the variable names stored in the model object using getVarMonitor.

niter

Integer. Total MCMC iterations (default 10000).

nburnin

Integer. Burn-in iterations (default 1000).

thin

Integer. Thinning for monitors (default 1).

nchain

Integer. Number of MCMC chains (default 1).

setSeed

Logical or numeric argument. Further details are provided in runMCMC setSeed argument.

Details

Model The single–index model uses a \(m\)-order polynomial spline with \(k\) interior knots as follows: $$f(t) = \sum_{j=1}^{m+k} B_j(t)\,\beta_j$$ on \([a, b]\) with \(t_i = X_i' \theta, i = 1,\cdots, n\) and \(\|\theta\|_2 = 1\). \(\{\beta_j\}_{j=1}^{m+k}\) are spline coefficients and \(a_\theta\), \(b_\theta\) are boundary knots where \(a_{\theta} = min(t_i, i = 1, \cdots, n) - \delta\), and \(b_{\theta} = max(t_i, i = 1,\cdots, n) + \delta\).

Priors

  • von Mises–Fisher prior on the index \(\theta\) with direction and concentration.

  • Conditioned on \(\theta\) and \(\sigma^2\), the link coefficients \(\beta = (\beta_1,\ldots,\beta_{m+k})^\top\) follow normal distribution with estimated mean vector \(\hat{\beta}_{\theta} = (X_{\theta}'X_{\theta})^{-1}X_{\theta}'Y\) and covariance \(\sigma^2 (X_{\theta}^\top X_{\theta})^{-1}\), where \(X_{\theta}\) is the B-spline basis design matrix.

  • Inverse gamma prior on \(\sigma^2\) with shape parameter \(a_\sigma\) and rate parameter \(b_\sigma\).

Sampling Random walk metropolis algorithm is used for index vector \(\theta\). Given \(\theta\), \(\sigma^2\) and \(\beta\) are sampled from posterior distribution. Further sampling method is described in Antoniadis et al(2004).

Prior hyper-parameters These are the prior hyper-parameters set in the function. You can define new values for each parameter in prior_param.

  1. Index vector: von Mises--Fisher prior for the projection vector \(\theta\) (index). index_direction is a unit direction vector of the von Mises--Fisher distribution. By default, the estimated vector from projection pursuit regression is assigned. index_dispersion is the positive concentration parameter. By default, 150 is assigned.

  2. Link function: B-spline basis and coefficient of B-spline setup.

    • basis: For the basis of B-spline, link_basis_df is the number of basis functions (default 21), link_basis_degree is the spline degree (default 2) and link_basis_delta is a small jitter for boundary knots spacing control (default 0.001).

    • beta: For the coefficient of B-spline, multivariate normal prior is assigned with mean link_beta_mu, and covariance link_beta_cov. By default, \(\mathcal{N}_p\!\big(0, \mathrm{I}_p\big)\) is assigned.

  3. Error variance (sigma2): An Inverse gamma prior is assigned to \(\sigma^2\) where sigma2_shape is shape parameter and sigma2_rate is rate parameter of inverse gamma distribution. (default sigma2_shape = 0.001, sigma2_rate = 100)

Initial values These are the initial values set in the function. You can define new values for each initial value in init_param.

  1. Index vector: Initial unit index vector \(\theta\). By default, the vector is randomly sampled from the von Mises--Fisher prior.

  2. Link function: Initial spline coefficients (link_beta). By default, \(\big(X_{\theta}^\top X_{\theta} + \rho\, \mathrm{I}\big)^{-1} X_{\theta}^\top Y\) is computed, where \(X_{\theta}\) is the B-spline basis design matrix.

  3. Error variance (sigma2): Initial scalar error variance (default 0.01).

References

Antoniadis, A., Grégoire, G., & McKeague, I. W. (2004). Bayesian estimation in single-index models. Statistica Sinica, 1147-1164.

Hornik, K., & Grün, B. (2014). movMF: an R package for fitting mixtures of von Mises-Fisher distributions. Journal of Statistical Software, 58, 1-31.

Examples

Run this code
# \donttest{
set.seed(123)
n <- 200; d <- 4
theta <- c(2, 1, 1, 1); theta <- theta / sqrt(sum(theta^2))
f <- function(u) u^2 * exp(u)
sigma <- 0.5
X <- matrix(runif(n * d, -1, 1), nrow = n)
index_vals <- as.vector(X %*% theta)
y <- f(index_vals) + rnorm(n, 0, sigma)
simdata <- data.frame(x = X, y = y)
colnames(simdata) <- c(paste0("X", 1:4), "y")

# One tool version
fit1 <- bsFisher(y ~ ., data = simdata,
                 niter = 5000, nburnin = 1000, nchain = 1)

# Split version
models <- bsFisher_setup(y ~ ., data = simdata)
Ccompile <- compileModelAndMCMC(models)
nimSampler <- get_sampler(Ccompile)
initList <- getInit(models)
mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1,
                   nchains = 1, setSeed = TRUE, inits = initList,
                   summary = TRUE, samplesAsCodaMCMC = TRUE)
fit2 <- as_bsim(models, mcmc.out)
summary(fit2)


# }

Run the code above in your browser using DataLab