Learn R Programming

BayesSIM (version 1.0.0)

gpPolar: Bayesian Single-Index Regression with Gaussian Process Link and One-to-One Polar Transformation

Description

Fits a single–index model \(Y_i \sim \mathcal{N}(f(X_i'\theta), \sigma^2), i = 1,\cdots,n\), where the index \(\theta\) is specified and computed via a one-to-one polar transformation, and the link \(f(\cdot)\) is represented with a Gaussian process.

Usage

gpPolar(
  formula,
  data,
  prior = NULL,
  init = NULL,
  monitors = NULL,
  niter = 10000,
  nburnin = 1000,
  thin = 1,
  nchain = 1,
  setSeed = FALSE
)

gpPolar_setup( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE )

gpPolarHigh( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE )

gpPolarHigh_setup( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE )

Value

A list typically containing:

coefficients

Mean estimates of index vector. Return list of model_setup does not include it.

ses

Mean standard error of index vector. Return list of model_setup does not include it.

residuals

Residuals with mean estimates of fitted values. Return list of model_setup does not include it.

fitted.values

Mean estimates of fitted values. Return list of model_setup does not include it.

linear.predictors

Mean estimates of single-index values. Return list of model_setup does not include it.

gof

Goodness-of-fit. Return list of model_setup function does not include it.

samples

Posterior draws of variables for computing fitted values of the model, including \(\theta\), \(\sigma^2\). Return list of model_setup does not include it.

input

List of data used in modeling, formula, input values for prior and initial values, and computation time without compiling.

defModel

Nimble model object.

defSampler

Nimble sampler object.

modelName

Name of the model.

Arguments

formula

an object of class formula. Interaction term is not allowed for single-index model.

data

an data frame.

prior

Optional named list of prior settings. Further descriptions are in Details section.

init

Optional named list of initial values. If the values are not assigned, they are randomly sampled from prior or designated value. Further descriptions are in Details section.

monitors

Optional character vector of additional monitor nodes. To check the names of the nodes, fit the model_setup function and then inspect the variable names stored in the model object using getVarMonitor.

niter

Integer. Total MCMC iterations (default 10000).

nburnin

Integer. Burn-in iterations (default 1000).

thin

Integer. Thinning for monitors (default 1).

nchain

Integer. Number of MCMC chains (default 1).

setSeed

Logical or numeric argument. Further details are provided in runMCMC setSeed argument.

Details

Model The single–index model is specified as \(Y_i = f(X_i'{\theta}) + \epsilon_i\), where the index vector \(\theta\) lies on the unit sphere with (\(\|\theta\|_2=1\)) with non-zero first component to ensure identifiability and is parameterized via a one-to-one polar transformation with angle \(\psi_1,...,\psi_{p-1}\).

The mapping is $$ \begin{aligned} \theta_1 &= \sin(\psi_1),\\ \theta_i &= \Big(\prod_{j=1}^{i-1}\cos(\psi_j)\Big)\sin(\psi_i), \quad i=2,\dots,p-1,\\ \theta_p &= \prod_{j=1}^{p-1}\cos(\psi_j). \end{aligned} $$ The vector is then scaled to unit length.

Sampling is performed on the angular parameters \(\theta\) defining the index vector. The link function \(f(\cdot)\) is modeled by a Gaussian process prior with zero mean and an Ornstein–Uhlenbeck (OU) covariance kernel \(\exp(-\kappa \cdot |t_i - t_j|), i, j = 1,\ldots, n\), where \(\kappa\) is a bandwidth (smoothness) parameter and \(t_i, t_j\) is index value (\(t_i = X_i'\theta\)).

Priors

  • \(\psi\) is \(p-1\) dimension of polar angle of index vector and re-scaled Beta distribution on \([0, \pi]\) is assigned for prior.

  • Prior for \(\kappa\) (bandwidth parameter) is discrete uniform of equally spaced grid points in \([\kappa_{\text{min}}, \kappa_{\text{max}}\)].

  • Inverse gamma prior on \(\sigma^2\) with shape parameter \(a_\sigma\) and rate parameter \(b_\sigma\).

Sampling For gpPolar, \(\theta\) is sampled by Metropolis-Hastings and updated with \(f\), \(\kappa\) is chosen by grid search method that maximizes likelihood, \(\sigma^2\) are sampled from their full conditional distributions using Gibbs sampling. Since \(\kappa\) is sampled by grid search, more than 5 dimension of variables gpPolarHigh is recommended. For gpPolarHigh, all sampling parameters' samplers are assigned by nimble.

Prior hyper-parameters These are the prior hyper-parameters set in the function. You can define new values for each parameter in prior_param.

  1. Index vector: Only shape parameter index_psi_alpha of \(p-1\) dimension vector is needed since rate parameters is computed to satisfy \(\theta_{j, \text{MAP}}\). By default, the shape parameter for each element of the polar vector is set to 5000.

  2. Link function: link_kappa_min is minimum value of kappa (default 0.5), link_kappa_max is maximum value of kappa (default 4), and link_kappa_grid_width is space (default 0.1).

  3. Error variance (sigma2): An Inverse gamma prior is assigned to \(\sigma^2\) where sigma2_shape is shape parameter and sigma2_rate is rate parameter of inverse gamma distribution. (default sigma2_shape = 2, sigma2_rate = 0.01)

Initial values These are the initial values set in the function. You can define new values for each initial value in init_param.

  1. Index vector: Initial vector of polar angle index_psi with \(p-1\) dimension. Each element of angle is between 0 and \(\pi\).

  2. Link function: Initial scalar scale parameter of covariance kernel link_kappa. (default: 2)

  3. Error variance (sigma2): Initial scalar error variance. (default: 0.01)

References

Dhara, K., Lipsitz, S., Pati, D., & Sinha, D. (2019). A new Bayesian single index model with or without covariates missing at random. Bayesian analysis, 15(3), 759.

Examples

Run this code
# \donttest{
library(MASS)
N <- 100    # Sample Size
p <- 3
mu <- c(0,0,0)
rho <- 0
Cx <- rbind(c(1,rho,rho), c(rho,1,rho), c(rho, rho,1))
X <- mvrnorm(n = N, mu=mu, Sigma=Cx, tol=1e-8)
alpha <- c(1,1,1)
alpha <- alpha/sqrt(sum(alpha^2))
z <- matrix(0,N)
z <- X %*% alpha
z <- z[,1]
sigma <- 0.3
f <- exp(z)
y <- f + rnorm(N, 0, sd=sigma) # adding noise
y <- y-mean(y)
f_all <- f
x_all <- X
y_all <- y
simdata <- cbind(x_all, y, f)
simdata <- as.data.frame(simdata)
colnames(simdata) = c('x1', 'x2', 'x3', 'y','f')

# One tool version
fit1 <- gpPolar(y ~ x1 + x2 + x3, data = simdata,
                niter = 5000, nburnin = 1000, nchain = 1)
fit2 <- gpPolarHigh(y ~ x1 + x2 + x3, data = simdata,
                    niter = 5000, nburnin = 1000, nchain = 1)

# Split version
models1 <- gpPolar_setup(y ~ x1 + x2 + x3, data = simdata)
models2 <- gpPolarHigh_setup(y ~ x1 + x2 + x3, data = simdata)
Ccompile1 <- compileModelAndMCMC(models1)
Ccompile2 <- compileModelAndMCMC(models2)
sampler1 <- get_sampler(Ccompile1)
sampler2 <- get_sampler(Ccompile2)
initList1 <- getInit(models1)
initList2 <- getInit(models2)
mcmc.out1 <- runMCMC(sampler1, niter = 5000, nburnin = 1000, thin = 1,
                    nchains = 1, setSeed = TRUE, init = initList1,
                    summary = TRUE, samplesAsCodaMCMC = TRUE)
mcmc.out2 <- runMCMC(sampler2, niter = 5000, nburnin = 1000, thin = 1,
                    nchains = 1, setSeed = TRUE, init = initList2,
                    summary = TRUE, samplesAsCodaMCMC = TRUE)
fit1_split <- as_bsim(models1, mcmc.out1)
fit2_split <- as_bsim(models2, mcmc.out2)
# }

Run the code above in your browser using DataLab