gpPolar: Bayesian Single-Index Regression with Gaussian Process Link and One-to-One Polar Transformation

Description

Fits a single–index model $Y_i \sim \mathcal{N}(f(X_i'\theta), \sigma^2), i = 1,\cdots,n$, where the index $\theta$ is specified and computed via a one-to-one polar transformation, and the link $f(\cdot)$ is represented with a Gaussian process.

Usage

gpPolar(
  formula,
  data,
  prior = NULL,
  init = NULL,
  monitors = NULL,
  niter = 10000,
  nburnin = 1000,
  thin = 1,
  nchain = 1,
  setSeed = FALSE
)
gpPolar_setup(
  formula,
  data,
  prior = NULL,
  init = NULL,
  monitors = NULL,
  niter = 10000,
  nburnin = 1000,
  thin = 1,
  nchain = 1,
  setSeed = FALSE
)
gpPolarHigh(
  formula,
  data,
  prior = NULL,
  init = NULL,
  monitors = NULL,
  niter = 10000,
  nburnin = 1000,
  thin = 1,
  nchain = 1,
  setSeed = FALSE
)
gpPolarHigh_setup(
  formula,
  data,
  prior = NULL,
  init = NULL,
  monitors = NULL,
  niter = 10000,
  nburnin = 1000,
  thin = 1,
  nchain = 1,
  setSeed = FALSE
)

Value

A list typically containing:

coefficients: Mean estimates of index vector. Return list of model_setup does not include it.
ses: Mean standard error of index vector. Return list of model_setup does not include it.
residuals: Residuals with mean estimates of fitted values. Return list of model_setup does not include it.
fitted.values: Mean estimates of fitted values. Return list of model_setup does not include it.
linear.predictors: Mean estimates of single-index values. Return list of model_setup does not include it.
gof: Goodness-of-fit. Return list of model_setup function does not include it.
samples: Posterior draws of variables for computing fitted values of the model, including $\theta$, $\sigma^2$. Return list of model_setup does not include it.
input: List of data used in modeling, formula, input values for prior and initial values, and computation time without compiling.
defModel: Nimble model object.
defSampler: Nimble sampler object.
modelName: Name of the model.

Arguments

formula: an object of class formula. Interaction term is not allowed for single-index model.
data: an data frame.
prior: Optional named list of prior settings. Further descriptions are in Details section.
init: Optional named list of initial values. If the values are not assigned, they are randomly sampled from prior or designated value. Further descriptions are in Details section.
monitors: Optional character vector of additional monitor nodes. To check the names of the nodes, fit the model_setup function and then inspect the variable names stored in the model object using getVarMonitor.
niter: Integer. Total MCMC iterations (default 10000).
nburnin: Integer. Burn-in iterations (default 1000).
thin: Integer. Thinning for monitors (default 1).
nchain: Integer. Number of MCMC chains (default 1).
setSeed: Logical or numeric argument. Further details are provided in runMCMC setSeed argument.

Details

Model The single–index model is specified as $Y_i = f(X_i'{\theta}) + \epsilon_i$, where the index vector $\theta$ lies on the unit sphere with ($\|\theta\|_2=1$) with non-zero first component to ensure identifiability and is parameterized via a one-to-one polar transformation with angle $\psi_1,...,\psi_{p-1}$.

The mapping is $$ \begin{aligned} \theta_1 &= \sin(\psi_1),\\ \theta_i &= \Big(\prod_{j=1}^{i-1}\cos(\psi_j)\Big)\sin(\psi_i), \quad i=2,\dots,p-1,\\ \theta_p &= \prod_{j=1}^{p-1}\cos(\psi_j). \end{aligned} $$ The vector is then scaled to unit length.

Sampling is performed on the angular parameters $\theta$ defining the index vector. The link function $f(\cdot)$ is modeled by a Gaussian process prior with zero mean and an Ornstein–Uhlenbeck (OU) covariance kernel $\exp(-\kappa \cdot |t_i - t_j|), i, j = 1,\ldots, n$, where $\kappa$ is a bandwidth (smoothness) parameter and $t_i, t_j$ is index value ($t_i = X_i'\theta$).

Priors

$\psi$ is $p-1$ dimension of polar angle of index vector and re-scaled Beta distribution on $[0, \pi]$ is assigned for prior.
Prior for $\kappa$ (bandwidth parameter) is discrete uniform of equally spaced grid points in $[\kappa_{\text{min}}, \kappa_{\text{max}}$].
Inverse gamma prior on $\sigma^2$ with shape parameter $a_\sigma$ and rate parameter $b_\sigma$.

Sampling For gpPolar, $\theta$ is sampled by Metropolis-Hastings and updated with $f$, $\kappa$ is chosen by grid search method that maximizes likelihood, $\sigma^2$ are sampled from their full conditional distributions using Gibbs sampling. Since $\kappa$ is sampled by grid search, more than 5 dimension of variables gpPolarHigh is recommended. For gpPolarHigh, all sampling parameters' samplers are assigned by nimble.

Prior hyper-parameters These are the prior hyper-parameters set in the function. You can define new values for each parameter in prior_param.

Index vector: Only shape parameter index_psi_alpha of $p-1$ dimension vector is needed since rate parameters is computed to satisfy $\theta_{j, \text{MAP}}$. By default, the shape parameter for each element of the polar vector is set to 5000.
Link function: link_kappa_min is minimum value of kappa (default 0.5), link_kappa_max is maximum value of kappa (default 4), and link_kappa_grid_width is space (default 0.1).
Error variance (sigma2): An Inverse gamma prior is assigned to $\sigma^2$ where sigma2_shape is shape parameter and sigma2_rate is rate parameter of inverse gamma distribution. (default sigma2_shape = 2, sigma2_rate = 0.01)

Initial values These are the initial values set in the function. You can define new values for each initial value in init_param.

Index vector: Initial vector of polar angle index_psi with $p-1$ dimension. Each element of angle is between 0 and $\pi$.
Link function: Initial scalar scale parameter of covariance kernel link_kappa. (default: 2)
Error variance (sigma2): Initial scalar error variance. (default: 0.01)

References

Dhara, K., Lipsitz, S., Pati, D., & Sinha, D. (2019). A new Bayesian single index model with or without covariates missing at random. Bayesian analysis, 15(3), 759.

Examples

Run this code

# \donttest{
library(MASS)
N <- 100    # Sample Size
p <- 3
mu <- c(0,0,0)
rho <- 0
Cx <- rbind(c(1,rho,rho), c(rho,1,rho), c(rho, rho,1))
X <- mvrnorm(n = N, mu=mu, Sigma=Cx, tol=1e-8)
alpha <- c(1,1,1)
alpha <- alpha/sqrt(sum(alpha^2))
z <- matrix(0,N)
z <- X %*% alpha
z <- z[,1]
sigma <- 0.3
f <- exp(z)
y <- f + rnorm(N, 0, sd=sigma) # adding noise
y <- y-mean(y)
f_all <- f
x_all <- X
y_all <- y
simdata <- cbind(x_all, y, f)
simdata <- as.data.frame(simdata)
colnames(simdata) = c('x1', 'x2', 'x3', 'y','f')

# One tool version
fit1 <- gpPolar(y ~ x1 + x2 + x3, data = simdata,
                niter = 5000, nburnin = 1000, nchain = 1)
fit2 <- gpPolarHigh(y ~ x1 + x2 + x3, data = simdata,
                    niter = 5000, nburnin = 1000, nchain = 1)

# Split version
models1 <- gpPolar_setup(y ~ x1 + x2 + x3, data = simdata)
models2 <- gpPolarHigh_setup(y ~ x1 + x2 + x3, data = simdata)
Ccompile1 <- compileModelAndMCMC(models1)
Ccompile2 <- compileModelAndMCMC(models2)
sampler1 <- get_sampler(Ccompile1)
sampler2 <- get_sampler(Ccompile2)
initList1 <- getInit(models1)
initList2 <- getInit(models2)
mcmc.out1 <- runMCMC(sampler1, niter = 5000, nburnin = 1000, thin = 1,
                    nchains = 1, setSeed = TRUE, init = initList1,
                    summary = TRUE, samplesAsCodaMCMC = TRUE)
mcmc.out2 <- runMCMC(sampler2, niter = 5000, nburnin = 1000, thin = 1,
                    nchains = 1, setSeed = TRUE, init = initList2,
                    summary = TRUE, samplesAsCodaMCMC = TRUE)
fit1_split <- as_bsim(models1, mcmc.out1)
fit2_split <- as_bsim(models2, mcmc.out2)
# }

Run the code above in your browser using DataLab