Set parameters for optimization of the covariance parameters of a GPModel
# S3 method for GPModel
set_optim_params(gp_model, params = list())

gp_model: A GPModel
params: A list with parameters for the estimation / optimization
optimizer_cov: string (default = "gradient_descent").
Optimizer used for estimating covariance parameters.
Options: "gradient_descent", "fisher_scoring", "nelder_mead", "bfgs", "adam".
If there are additional auxiliary parameters for non-Gaussian likelihoods,
'optimizer_cov' is also used for those parameters
optimizer_coef: string (default = "wls" for Gaussian likelihoods and "gradient_descent" for other likelihoods).
Optimizer used for estimating linear regression coefficients, if there are any
(for the GPBoost algorithm there are usually none).
Options: "gradient_descent", "wls", "nelder_mead", "bfgs", "adam". Gradient descent steps are done simultaneously
with gradient descent steps for the covariance parameters.
"wls" refers to doing coordinate descent for the regression coefficients using weighted least squares.
If 'optimizer_cov' is set to "nelder_mead", "bfgs", or "adam",
'optimizer_coef' is automatically also set to the same value.
maxit: integer (default = 1000).
Maximal number of iterations for the optimization algorithm
delta_rel_conv: numeric (default = 1E-6 except for "nelder_mead" for which the default is 1E-8).
Convergence tolerance. The algorithm stops if the relative change
in either the (approximate) log-likelihood or the parameters is below this value.
For "bfgs" and "adam", the L2 norm of the gradient is used instead of the relative change in the log-likelihood.
If < 0, internal default values are used
convergence_criterion: string (default = "relative_change_in_log_likelihood").
The convergence criterion used for terminating the optimization algorithm.
Options: "relative_change_in_log_likelihood" or "relative_change_in_parameters"
init_coef: vector with numeric elements (default = NULL).
Initial values for the regression coefficients (if there are any, can be NULL)
init_cov_pars: vector with numeric elements (default = NULL).
Initial values for covariance parameters of Gaussian process and
random effects (can be NULL)
lr_coef: numeric (default = 0.1).
Learning rate for fixed effect regression coefficients if gradient descent is used
lr_cov: numeric (default = 0.1 for "gradient_descent" and 1. for "fisher_scoring").
Initial learning rate for covariance parameters.
If lr_cov < 0, internal default values are used.
If there are additional auxiliary parameters for non-Gaussian likelihoods,
'lr_cov' is also used for those parameters
use_nesterov_acc: boolean (default = TRUE).
If TRUE, Nesterov acceleration is used.
This applies only to gradient descent
acc_rate_coef: numeric (default = 0.5).
Acceleration rate for regression coefficients (if there are any)
for Nesterov acceleration
acc_rate_cov: numeric (default = 0.5).
Acceleration rate for covariance parameters for Nesterov acceleration
momentum_offset: integer (default = 2).
Number of iterations for which no momentum is applied in the beginning.
trace: boolean (default = FALSE).
If TRUE, information on the progress of the parameter
optimization is printed
std_dev: boolean (default = TRUE).
If TRUE, approximate standard deviations are calculated for the covariance and linear regression parameters
(= square root of diagonal of the inverse Fisher information for Gaussian likelihoods and
square root of diagonal of a numerically approximated inverse Hessian for non-Gaussian likelihoods)
init_aux_pars: vector with numeric elements (default = NULL).
Initial values for additional parameters for non-Gaussian likelihoods
(e.g., shape parameter of gamma likelihood)
estimate_aux_pars: boolean (default = TRUE).
If TRUE, additional parameters for non-Gaussian likelihoods
are also estimated (e.g., shape parameter of gamma likelihood)
cg_max_num_it: integer (default = 1000).
Maximal number of iterations for conjugate gradient algorithms
cg_max_num_it_tridiag: integer (default = 1000).
Maximal number of iterations for the conjugate gradient algorithm
when run as a Lanczos algorithm for tridiagonalization
cg_delta_conv: numeric (default = 1E-2).
Tolerance level for the L2 norm of the residuals for checking convergence
in the conjugate gradient algorithm when used for parameter estimation
num_rand_vec_trace: integer (default = 50).
Number of random vectors (e.g., Rademacher) for stochastic approximation of the trace of a matrix
reuse_rand_vec_trace: boolean (default = TRUE).
If TRUE, random vectors (e.g., Rademacher) for the stochastic approximation
of the trace of a matrix are sampled only once at the beginning of
Newton's method for finding the mode in the Laplace approximation
and are then reused in later trace approximations.
Otherwise, they are sampled each time a trace is calculated
seed_rand_vec_trace: integer (default = 1).
Seed number to generate random vectors (e.g., Rademacher)
piv_chol_rank: integer (default = 50).
Rank of the pivoted Cholesky decomposition used as
preconditioner in conjugate gradient algorithms
cg_preconditioner_type: string
(default = "Sigma_inv_plus_BtWB" for non-Gaussian likelihoods and a Vecchia-Laplace approximation).
Type of preconditioner used for conjugate gradient algorithms.
Options for non-Gaussian likelihoods and a Vecchia-Laplace approximation:
"piv_chol_on_Sigma": (Lk * Lk^T + W^-1) as preconditioner for inverting (B^-1 * D * B^-T + W^-1), where Lk is a low-rank pivoted Cholesky approximation for Sigma and B^-1 * D * B^-T approx= Sigma
"Sigma_inv_plus_BtWB": (B^T * (D^-1 + W) * B) as preconditioner for inverting (B^T * D^-1 * B + W), where B^T * D^-1 * B approx= Sigma^-1
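The options above are passed together as a single 'params' list. A minimal sketch combining several of the documented parameters with their documented names and defaults (the call into gpboost is guarded so the sketch also runs when the package is not installed):

```r
# Assemble an optimization-parameter list from options documented above.
opt_params <- list(
  optimizer_cov = "gradient_descent",  # optimizer for covariance parameters
  optimizer_coef = "wls",              # coordinate descent via weighted least squares
  maxit = 500,                         # maximal number of iterations
  delta_rel_conv = 1e-6,               # convergence tolerance
  convergence_criterion = "relative_change_in_log_likelihood",
  lr_cov = 0.1,                        # initial learning rate for covariance parameters
  use_nesterov_acc = TRUE,             # Nesterov acceleration (gradient descent only)
  acc_rate_cov = 0.5,                  # acceleration rate for covariance parameters
  momentum_offset = 2,                 # iterations without momentum at the start
  trace = FALSE                        # no progress output
)

# Pass the list to a GPModel, as in the example below; guarded with
# requireNamespace() so this sketch degrades gracefully without gpboost.
if (requireNamespace("gpboost", quietly = TRUE)) {
  data(GPBoost_data, package = "gpboost")
  gp_model <- gpboost::GPModel(group_data = group_data, likelihood = "gaussian")
  gpboost::set_optim_params(gp_model, params = opt_params)
}
```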
Author: Fabio Sigrist
# \donttest{
data(GPBoost_data, package = "gpboost")
gp_model <- GPModel(group_data = group_data, likelihood="gaussian")
set_optim_params(gp_model, params=list(optimizer_cov="nelder_mead"))
# }
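A further hedged sketch for a non-Gaussian likelihood: the gamma likelihood has a shape parameter, which is an "auxiliary parameter" in the sense of 'estimate_aux_pars' and 'init_aux_pars' above. The gpboost call is guarded so the sketch runs even where the package is absent:

```r
# Optimization parameters for a gamma likelihood, whose shape parameter
# is an auxiliary parameter (see 'init_aux_pars' / 'estimate_aux_pars' above).
aux_params <- list(
  optimizer_cov = "gradient_descent",  # also used for auxiliary parameters
  estimate_aux_pars = TRUE,            # estimate the gamma shape parameter
  init_aux_pars = 1,                   # initial value for the shape parameter
  std_dev = TRUE                       # report approximate standard deviations
)

if (requireNamespace("gpboost", quietly = TRUE)) {
  data(GPBoost_data, package = "gpboost")
  gp_model <- gpboost::GPModel(group_data = group_data, likelihood = "gamma")
  gpboost::set_optim_params(gp_model, params = aux_params)
}
```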