gbsar: Bayesian Shape-Restricted Spectral Analysis for Generalized Partial Linear Models

Description

This function fits a Bayesian generalized partial linear regression model to estimate shape-restricted functions using a spectral analysis of Gaussian process priors.

Usage

gbsar(y, w, x, xmin, xmax, n, family, link, nbasis, nint, mcmc = list(), prior = list(),
shape = c('Free','Increasing','Decreasing','IncreasingConvex','DecreasingConcave',
          'IncreasingConcave','DecreasingConvex','IncreasingS','DecreasingS',
          'IncreasingRotatedS','DecreasingRotatedS','InvertedU','Ushape'),
marginal.likelihood = TRUE)

Arguments

a vector of response values.

a vector or matrix giving covariates of dimension n times ndimw excluding intercept for a parametric component (w can be omitted)

a vector or matrix giving covariates of dimension n times K for nonparametric components.

xmin

a vector or scalar giving user-specific minimum values of x. The default values are minimum values of x.

xmax

a vector or scalar giving user-specific maximum values of x. The default values are maximum values of x.

an integer vector containing the number of trials for binomial data.

family

a description of the error distribution to be used in the model: The families contains bernoulli (“bernoulli”), poisson (“poisson”), negative-binomial (“negative.binomial”), poisson-gamma mixture (“poisson.gamma”).

link

a description of the link function to be used in the model.

nbasis

number of cosine basis functions.

nint

number of grid points where the unknown function is evaluated for plotting. The default is 200.

mcmc

a list giving the MCMC parameters. The list includes the following integers (with default values in parentheses): nblow0 (1000) giving the number of initialization period for adaptive metropolis, maxmodmet (5) giving the maximum number of times to modify metropolis, nblow (10000) giving the number of MCMC in transition period, nskip (10) giving the thinning interval, smcmc (1000) giving the number of MCMC for analysis, and ndisp (1000) giving the number of saved draws to be displayed on screen (the function reports on the screen when every ndisp iterations have been carried out).

prior

a list giving the prior information. The list includes the following parameters (default values specify the non-informative prior): iflagprior choosing a smoothing prior for spectral coefficients (iflagprior=0 assigns T-Smoother prior (default), iflagprior=1 chooses Lasso-Smoother prior), theta_m0, theta0_m0 and theta0_s0 giving the hyperparameters for prior distribution of the spectral coefficients (theta0_m0 and theta0_s0 are used when the functions have shape-restriction), tau2_m0, tau2_s0 and w0 giving the prior mean and standard deviation of smoothing prior (When iflagprior=1, tau2_m0 is only used as the hyperparameter), beta_m0 and beta_v0 giving the hyperparameters of the multivariate normal distribution for parametric part including intercept, alpha_m0 and alpha_s0 giving the prior mean and standard deviation of the truncated normal prior distribution for the constant of integration, iflagpsi determining the prior of slope for logisitic function in S or U shaped (iflagpsi=1 (default), slope $\psi$ is sampled and iflagpsi=0, $\psi$ is fixed), psifixed giving initial value (iflagpsi=1) or fixed value (iflagpsi=0) of slope, omega_m0 and omega_s0 giving the prior mean and standard deviation of the truncated normal prior distribution for the inflection point of S or U shaped function, kappa_m0 and kappa_v0 giving the prior mean and variance of the gammal prior distribution for dispersion parameter (negative-binomial).

shape

a vector giving types of shape restriction.

marginal.likelihood

a logical variable indicating whether the log marginal likelihood is calculated. The methods of Gelfand and Dey (1994) and Newton and Raftery (1994) are used.

Value

An object of class bsam representing the Bayesian spectral analysis model fit. Generic functions such as print, fitted and plot have methods to show the results of the fit.

The MCMC samples of the parameters in the model are stored in the list mcmc.draws, the posterior samples of the fitted values are stored in the list fit.draws, and the MCMC samples for the log marginal likelihood are saved in the list loglik.draws. The output list also includes the following objects:

post.est

posterior estimates for all parameters in the model.

lmarg.gd

log marginal likelihood using Gelfand-Dey method.

lmarg.nr

log marginal likelihood using Netwon-Raftery method, which is biased.

family

the family object used.

link

the link object used.

call

the matched call.

mcmctime

running time of Markov chain from system.time().

Details

This generic function fits a Bayesian generalized partial linear regression models for estimating shape-restricted functions using Gaussian process priors. For enforcing shape-restrictions, they assumed that the derivatives of the functions are squares of Gaussian processes.

Let $y_i$ and $w_i$ be the response and the vector of parametric predictors, respectively. Further, let $x_{i,k}$ be the covariate related to the response through an unknown shape-restricted function. The model for estimating shape-restricted functions is as follows.

$$y_i | \mu_i \sim F(\mu_i), $$ $$g(\mu_i) = w_i^T\beta + \sum_{k=1}^K f_k(x_{i,k}), ~ i=1,\ldots,n, $$ where $g(\cdot)$ is a link function and $f_k$ is an unknown nonlinear function of the scalar $x_{i,k} \in [0,1]$.

The prior of function without shape restriction is: $$f(x) = Z(x), $$ where $Z$ is a second-order Gaussian process with mean function equal to zero and covariance function $\nu(s,t) = E[Z(s)Z(t)]$ for $s, t \in [0, 1]$. The Gaussian process is expressed with the spectral representation based on cosine basis functions: $$Z(x) = \sum_{j=0}^\infty \theta_j\varphi_j(x)$$ $$\varphi_0(x) = 1 ~~ \code{and} ~~ \varphi_j(x) = \sqrt{2}\cos(\pi j x), ~ j \ge 1, ~ 0 \le x \le 1$$

The shape-restricted functions are modeled by assuming the $q$th derivatives of $f$ are squares of Gaussian processes: $$f^{(q)}(x) = \delta Z^2(x)h(x), ~~ \delta \in \{1, -1\}, ~~ q \in \{1, 2\},$$ where $h$ is the squish function. For monotonic, monotonic convex, and concave functions, $h(x)=1$, while for S and U shaped functions, $h$ is defined by $$h(x) = \frac{1 - \exp[\psi(x - \omega)]}{1 + \exp[\psi(x - \omega)]}, ~~ \psi > 0, ~~ 0 < \omega < 1$$

For the spectral coefficients of functions without shape constraints, the following prior is used (The intercept is included in $\beta$): $$\theta_j | \tau, \gamma \sim N(0, \tau^2\exp[-j\gamma]), ~ j \ge 1$$

The priors for the spectral coefficients of shape restricted functions are: $$\theta_0 | \sim N(m_{\theta_0}, v^2_{\theta_0}), \quad \theta_j | \tau, \gamma \sim N(m_{\theta_j}, \tau^2\exp[-j\gamma]), ~ j \ge 1$$

To complete the model specification, the following prior is assumed for $\beta$: $$\beta | \sim N(m_{0,\beta}, V_{0,\beta})$$

References

Albert, J. H. and Chib, S. (1993) Bayesian Analysis of Binary and Polychotomous Response Data. Journal of the American Statistical Association, 88, 669-679.

Holmes, C. C. and Held, L. (2006) Bayesian Auxiliary Variables Models for Binary and Multinomial Regression. Bayesian Analysis, 1, 145-168.

Lenk, P. and Choi, T. (2017) Bayesian Analysis of Shape-Restricted Functions using Gaussian Process Priors. Statistica Sinica, 27, 43-69.

Gelfand, A. E. and Dey, K. K. (1994) Bayesian model choice: asymptotics and exact calculations. Journal of the Royal Statistical Society. Series B - Statistical Methodology, 56, 501-514.

Newton, M. A. and Raftery, A. E. (1994) Approximate Bayesian inference with the weighted likelihood bootstrap (with discussion). Journal of the Royal Statistical Society. Series B - Statistical Methodology, 56, 3-48.

Examples

Run this code

# NOT RUN {
	###########################
	# Probit Regression Model #
	###########################

	# Simulate data
	  set.seed(1)

	  n <- 100
	  x <- runif(n)
	  y <- log(1 + 10*x) + rnorm(n, sd = 1)

	  # Number of cosine basis functions
	  nbasis <- 50

	  # Fit the model with default priors and mcmc parameters
	  fout <- gbsar(y = y, x = x, family = binomial(link = "probit"),
	                nbasis = nbasis, shape = 'IncreasingConcave')

	  # Summary
	  print(fout)

	  # fitted values
	  fit=fitted(fout)

	  # Plot
	  plot(fit,'topleft',ask=TRUE)

	######################################
	# Logistic Additive Regression Model #
	######################################

	# Wage-Union data
	  data(wage.union); attach(wage.union)

	  race[race==1 | race==2]=0
	  race[race==3]=1

	  y <- union
	  w <- cbind(race,sex,south)
	  x <- cbind(wage,education,age)

	  # mcmc parameters
	  mcmc <- list(blow0 = 10000,
	               blow = 10000,
	               nskip = 10,
	               smcmc = 1000,
	               ndisp = 1000,
	               maxmodmet = 10)

	  foutGBSAR <- gbsar(y = y, w = w, x = x, family = 'binomial',
	                     link = 'logit', nbasis = 50, mcmc = mcmc,
                       shape = c('Free','Decreasing','Increasing'))

	  # fitted values
	  fitGBSAR <- fitted(foutGBSAR)

	  # Plot
	  plot(fitGBSAR, ask = TRUE)

# }

Run the code above in your browser using DataLab