This function serves as a wrapper to build, fit, and make predictions
for a Gaussian process model. It calls the functions gp, gp.mcmc,
gp.optim, and gp.predict.
GaSP(
  formula = ~1,
  output,
  input,
  param,
  smooth.est = FALSE,
  input.new = NULL,
  cov.model = list(family = "CH", form = "isotropic"),
  model.fit = "Cauchy_prior",
  prior = list(),
  proposal = list(range = 0.35, tail = 2, nugget = 0.8, nu = 0.8),
  nsample = 5000,
  burnin = 1000,
  opt = NULL,
  bound = NULL,
  dtype = "Euclidean",
  verbose = TRUE
)
Returns a list containing the S4 object gp and the prediction results.
formula: an object of formula class that specifies regressors; see formula for details.
output: a numerical vector of observations or outputs in a GaSP.
input: a matrix of inputs in a GaSP.
param: a list of values for the regression parameters, covariance parameters, and nugget variance parameter. The specification of param depends on the covariance model.
The regression parameters are denoted by coeff. Default value is \(\mathbf{0}\).
The marginal variance or partial sill is denoted by sig2. Default value is 1.
The nugget variance parameter is denoted by nugget for all covariance models. Default value is 0.
For the Confluent Hypergeometric class, range is used to denote the range parameter \(\beta\). tail is used to denote the tail decay parameter \(\alpha\). nu is used to denote the smoothness parameter \(\nu\).
For the generalized Cauchy class, range is used to denote the range parameter \(\phi\). tail is used to denote the tail decay parameter \(\alpha\). nu is used to denote the smoothness parameter \(\nu\).
For the Matérn class, range is used to denote the range parameter \(\phi\). nu is used to denote the smoothness parameter \(\nu\). When \(\nu=0.5\), the Matérn class corresponds to the exponential covariance.
For the powered-exponential class, range is used to denote the range parameter \(\phi\). nu is used to denote the smoothness parameter \(\nu\). When \(\nu=2\), the powered-exponential class corresponds to the Gaussian covariance.
smooth.est: a logical value indicating whether the smoothness parameter will be estimated.
input.new: a matrix of new input locations.
cov.model: a list of two strings, family and form. family indicates the family of covariance functions: the Confluent Hypergeometric class, the Matérn class, the Cauchy class, or the powered-exponential class. form indicates the specific form of the covariance structure: the isotropic form, the tensor form, or the automatic relevance determination form.
The Confluent Hypergeometric correlation function is given by $$C(h) = \frac{\Gamma(\nu+\alpha)}{\Gamma(\nu)} \mathcal{U}\left(\alpha, 1-\nu, \left(\frac{h}{\beta}\right)^2\right),$$ where \(\alpha\) is the tail decay parameter. \(\beta\) is the range parameter. \(\nu\) is the smoothness parameter. \(\mathcal{U}(\cdot)\) is the confluent hypergeometric function of the second kind. For details about this covariance, see Ma and Bhadra (2019) at https://arxiv.org/abs/1911.05865.
The generalized Cauchy covariance is given by $$C(h) = \left\{ 1 + \left( \frac{h}{\phi} \right)^{\nu} \right\}^{-\alpha/\nu},$$ where \(\phi\) is the range parameter. \(\alpha\) is the tail decay parameter. \(\nu\) is the smoothness parameter with default value at 2.
The Matérn correlation function is given by $$C(h)=\frac{2^{1-\nu}}{\Gamma(\nu)} \left( \frac{h}{\phi} \right)^{\nu} \mathcal{K}_{\nu}\left( \frac{h}{\phi} \right),$$ where \(\phi\) is the range parameter. \(\nu\) is the smoothness parameter. \(\mathcal{K}_{\nu}(\cdot)\) is the modified Bessel function of the second kind of order \(\nu\).
The exponential correlation function is given by $$C(h)=\exp(-h/\phi),$$ where \(\phi\) is the range parameter. This is the Matérn correlation with \(\nu=0.5\).
The Matérn correlation with \(\nu=1.5\).
The Matérn correlation with \(\nu=2.5\).
The powered-exponential correlation function is given by $$C(h)=\exp\left\{-\left(\frac{h}{\phi}\right)^{\nu}\right\},$$ where \(\phi\) is the range parameter. \(\nu\) is the smoothness parameter.
The Gaussian correlation function is given by $$C(h)=\exp\left(-\frac{h^2}{\phi^2}\right),$$ where \(\phi\) is the range parameter.
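The special cases stated above can be checked numerically from the formulas as written. The following base-R sketch does so; the helper names (matern_cor, powexp_cor, cauchy_cor) are illustrative only and are not functions in GPBayes, whose internal implementation may differ.

```r
# Illustrative helpers (hypothetical names, not part of GPBayes).
# Each implements the correlation formula exactly as written above.
matern_cor <- function(h, phi, nu) {
  u <- h / phi
  out <- 2^(1 - nu) / gamma(nu) * u^nu * besselK(u, nu)
  out[h == 0] <- 1  # the formula is 0 * Inf at h = 0; its limit is 1
  out
}
powexp_cor <- function(h, phi, nu) exp(-(h / phi)^nu)
cauchy_cor <- function(h, phi, alpha, nu = 2) {
  (1 + (h / phi)^nu)^(-alpha / nu)
}

h <- seq(0, 5, by = 0.5)  # distances at which to evaluate
phi <- 1.5                # range parameter (arbitrary illustrative value)

# Matern with nu = 0.5 should match the exponential correlation.
matern_vs_exp <- max(abs(matern_cor(h, phi, nu = 0.5) - exp(-h / phi)))

# Powered-exponential with nu = 2 should match the Gaussian correlation.
powexp_vs_gauss <- max(abs(powexp_cor(h, phi, nu = 2) - exp(-h^2 / phi^2)))

# Both differences should be zero up to floating-point error.
print(c(matern_vs_exp = matern_vs_exp, powexp_vs_gauss = powexp_vs_gauss))
```

The explicit assignment at h = 0 is needed because besselK(0, nu) is infinite in R, so the raw formula evaluates to NaN there.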
This indicates the isotropic form of covariance functions. That is, $$C(\mathbf{h}) = C^0(\|\mathbf{h}\|; \boldsymbol \theta),$$ where \(\| \mathbf{h}\|\) denotes the Euclidean distance, or the great circle distance for data on a sphere. \(C^0(\cdot)\) denotes any isotropic covariance family specified in family.
This indicates the tensor product of correlation functions. That is, $$ C(\mathbf{h}) = \prod_{i=1}^d C^0(|h_i|; \boldsymbol \theta_i),$$ where \(d\) is the dimension of the input space and \(h_i\) is the distance along the \(i\)th input dimension. This type of covariance structure is often used in Gaussian process emulation for computer experiments.
This indicates the automatic relevance determination form. That is, $$C(\mathbf{h}) = C^0\left(\sqrt{\sum_{i=1}^d\frac{h_i^2}{\phi^2_i}}; \boldsymbol \theta \right),$$ where \(\phi_i\) denotes the range parameter along the \(i\)th input dimension.
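To make the three forms concrete, the sketch below evaluates each of them for a single two-dimensional lag vector, using the Gaussian correlation as the base family \(C^0\). All names and parameter values are illustrative assumptions; nothing here calls GPBayes.

```r
# Base isotropic correlation C^0: the Gaussian correlation
# (chosen only for illustration).
gauss_cor <- function(h, phi = 1) exp(-h^2 / phi^2)

h_vec <- c(0.3, 1.2)    # lag along each of d = 2 input dimensions
phi_vec <- c(0.5, 2.0)  # per-dimension range parameters (illustrative)

# Isotropic: a single range parameter applied to the Euclidean norm of h.
iso <- gauss_cor(sqrt(sum(h_vec^2)), phi = 1)

# Tensor: product of one-dimensional correlations, one range per dimension.
tensor <- prod(gauss_cor(abs(h_vec), phi = phi_vec))

# ARD: scale each coordinate by its own range, then apply C^0 to the
# scaled distance (the ranges are already absorbed, so phi = 1 here).
ard <- gauss_cor(sqrt(sum(h_vec^2 / phi_vec^2)), phi = 1)

print(c(isotropic = iso, tensor = tensor, ARD = ard))
```

With the Gaussian base family the tensor and ARD values coincide, because a product of exponentials equals the exponential of the sum. For other families, such as the exponential correlation, the two forms generally give different values.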
model.fit: a string indicating the choice of priors on correlation parameters or the estimation method:
This indicates that a fully Bayesian approach with objective priors is used for parameter estimation, where location-scale parameters are assigned constant priors and correlation parameters are assigned half-Cauchy priors (default).
This indicates that a fully Bayesian approach with objective priors is used for parameter estimation, where location-scale parameters are assigned constant priors and correlation parameters are assigned reference priors. This is only supported for isotropic covariance functions. For details, see gp.mcmc.
This indicates that a fully Bayesian approach with subjective priors is used for parameter estimation, where location-scale parameters are assigned constant priors and correlation parameters are assigned beta priors parameterized as \(Beta(a, b, lb, ub)\). In the beta distribution, lb and ub are the lower and upper bounds of the support for the correlation parameters, and they should be determined based on domain knowledge. a and b are the two shape parameters, each with default value 1, which corresponds to the uniform prior over the support \((lb, ub)\).
This indicates that maximum profile likelihood estimation (MPLE) is used.
This indicates that maximum marginal likelihood estimation (MMLE) is used.
This indicates that the marginal/integrated posterior is maximized.
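The \(Beta(a, b, lb, ub)\) prior mentioned above can be read as a standard beta distribution rescaled from \((0, 1)\) to \((lb, ub)\). The sketch below assumes that reading (the exact density used inside the package may be parameterized differently) and confirms that \(a = b = 1\) gives the uniform density on the support. The function name dbeta4 is hypothetical.

```r
# Four-parameter beta density on (lb, ub), assuming a standard beta
# distribution rescaled to that interval (hypothetical helper).
dbeta4 <- function(x, a = 1, b = 1, lb = 0, ub = 1) {
  z <- (x - lb) / (ub - lb)                      # map x onto (0, 1)
  dbeta(z, shape1 = a, shape2 = b) / (ub - lb)   # Jacobian of the rescaling
}

x <- seq(0.5, 9.5, by = 1)

# With a = b = 1 the density is constant at 1 / (ub - lb) on the support.
unif_density <- dbeta4(x, a = 1, b = 1, lb = 0, ub = 10)
print(unif_density)

# Other shape values concentrate prior mass within the same support.
peaked_density <- dbeta4(x, a = 2, b = 5, lb = 0, ub = 10)
print(round(peaked_density, 4))
```

Choosing lb and ub from domain knowledge, as the description recommends, fixes the support; a and b then control how prior mass is spread within it.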
prior: a list containing tuning parameters in the prior distribution. This is used only if a subjective Bayes estimation method with informative priors is used.
proposal: a list containing tuning parameters in the proposal distribution. This is used only if a Bayes estimation method is used.
nsample: an integer indicating the number of MCMC samples.
burnin: an integer indicating the burn-in period.
opt: a list of arguments to set up the optim routine. The current implementation uses three arguments:
The optimization method: Nelder-Mead or L-BFGS-B.
The lower bound for parameters.
The upper bound for parameters.
bound: Default value is NULL. Otherwise, it should be a list containing the following elements depending on the covariance class:
a list of bounds for the nugget parameter. It is a list containing lower bound lb and upper bound ub with default value list(lb=0, ub=Inf).
a list of bounds for the range parameter. It has default value range=list(lb=0, ub=Inf) for the Confluent Hypergeometric covariance, the Matérn covariance, exponential covariance, Gaussian covariance, powered-exponential covariance, and Cauchy covariance. The log of range parameterization is used: \(\log(\phi)\).
a list of bounds for the tail decay parameter. It has default value list(lb=0, ub=Inf).
a list of bounds for the smoothness parameter. It has default value list(lb=0, ub=Inf) for the Confluent Hypergeometric covariance and the Matérn covariance. When the powered-exponential or Cauchy class is used, it has default value nu=list(lb=0, ub=2). This can be achieved by specifying the lower bound in opt.
dtype: a string indicating the type of distance:
Euclidean distance is used. This is the default choice.
Great circle distance is used for data on a sphere.
verbose: a logical value. If it is TRUE, the MCMC progress bar is shown.
Pulong Ma mpulong@gmail.com
GPBayes-package, gp, gp.mcmc, gp.optim, gp.predict
library(GPBayes)

# test function: an oscillating curve for x <= 9.6, a linear trend beyond it
code = function(x){
  y = (sin(pi*x/5) + 0.2*cos(4*pi*x/5))*(x<=9.6) + (x/10-1)*(x>9.6)
  return(y)
}
n = 100
input = seq(0, 20, length.out=n)    # training inputs
XX = seq(0, 20, length.out=99)      # new input locations for prediction
Ztrue = code(input)
set.seed(1234)
output = Ztrue + rnorm(length(Ztrue), sd=0.1)   # noisy observations
# fitting a GaSP model with the objective Bayes approach
# (nsample and burnin are set far below their defaults of 5000 and 1000)
fit = GaSP(formula=~1, output, input,
           param=list(range=3, nugget=0.1, nu=2.5),
           smooth.est=FALSE, input.new=XX,
           cov.model=list(family="matern", form="isotropic"),
           proposal=list(range=.35, nugget=.8, nu=0.8),
           dtype="Euclidean", model.fit="Cauchy_prior", nsample=50,
           burnin=10, verbose=TRUE)