GPModel
Create a GPModel object
Create a GPModel which contains a Gaussian process and / or mixed effects model with grouped random effects
GPModel(group_data = NULL, group_rand_coef_data = NULL,
  ind_effect_group_rand_coef = NULL,
  drop_intercept_group_rand_effect = NULL, gp_coords = NULL,
  gp_rand_coef_data = NULL, cov_function = "exponential",
  cov_fct_shape = 0, cov_fct_taper_range = 1, vecchia_approx = FALSE,
  num_neighbors = 30L, vecchia_ordering = "none",
  vecchia_pred_type = "order_obs_first_cond_obs_only",
  num_neighbors_pred = num_neighbors, cluster_ids = NULL,
  free_raw_data = FALSE, likelihood = "gaussian")
A GPModel containing a Gaussian process and / or mixed effects model with grouped random effects
group_data
A vector or matrix whose columns are categorical grouping variables.
The elements are group levels defining grouped random effects.
The elements of 'group_data' can be integer, double, or character.
The number of columns corresponds to the number of grouped (intercept) random effects
group_rand_coef_data
A vector or matrix with numeric covariate data for grouped random coefficients
ind_effect_group_rand_coef
A vector with integer indices that indicate the corresponding categorical grouping variable (= column) in 'group_data' for
every covariate in 'group_rand_coef_data'. Counting starts at 1.
The length of this index vector must equal the number of covariates in 'group_rand_coef_data'.
For instance, c(1,1,2) means that the first two covariates (= first two columns) in 'group_rand_coef_data'
have random coefficients corresponding to the first categorical grouping variable (= first column) in 'group_data',
and the third covariate (= third column) in 'group_rand_coef_data' has a random coefficient
corresponding to the second grouping variable (= second column) in 'group_data'
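As a minimal sketch of the index mapping described above, the following sets up two grouping variables and three covariates. The simulated data and the column names (group_1, group_2, x1, x2, x3) are illustrative assumptions, not part of the package. The GPModel call uses only arguments from the signature on this page and runs only if gpboost is installed.

```r
set.seed(1)
n <- 100

# Two categorical grouping variables (the columns of group_data)
group_data <- cbind(
  group_1 = sample(1:10, n, replace = TRUE),
  group_2 = sample(1:5, n, replace = TRUE)
)

# Three numeric covariates that get random coefficients (random slopes)
group_rand_coef_data <- cbind(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))

# Covariates 1 and 2 pair with grouping variable 1, covariate 3 with grouping variable 2
ind_effect_group_rand_coef <- c(1, 1, 2)

# One index per covariate, each pointing at an existing column of group_data
stopifnot(
  length(ind_effect_group_rand_coef) == ncol(group_rand_coef_data),
  all(ind_effect_group_rand_coef %in% seq_len(ncol(group_data)))
)

# Readable view of which covariate is paired with which grouping variable
mapping <- data.frame(
  covariate = colnames(group_rand_coef_data),
  grouping_variable = colnames(group_data)[ind_effect_group_rand_coef]
)
print(mapping)

# Build the model only when the package is available
if (requireNamespace("gpboost", quietly = TRUE)) {
  gp_model <- gpboost::GPModel(
    group_data = group_data,
    group_rand_coef_data = group_rand_coef_data,
    ind_effect_group_rand_coef = ind_effect_group_rand_coef,
    likelihood = "gaussian"
  )
}
```

The stopifnot check mirrors the length requirement stated above and catches a mismatched index vector before the model is created.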
drop_intercept_group_rand_effect
A vector of type logical (boolean).
Indicates whether intercept random effects are dropped (only for random coefficients).
If drop_intercept_group_rand_effect[k] is TRUE, the intercept random effect number k is dropped / not included.
Only random effects with random slopes can be dropped.
gp_coords
A matrix with numeric coordinates (= inputs / features) for defining Gaussian processes
gp_rand_coef_data
A vector or matrix with numeric covariate data for Gaussian process random coefficients
cov_function
A string specifying the covariance function for the Gaussian process.
The following covariance functions are available:
"exponential", "gaussian", "matern", "powered_exponential", "wendland", and "exponential_tapered".
For "exponential", "gaussian", and "powered_exponential", we follow the notation and parametrization of Diggle and Ribeiro (2007).
For "matern", we follow the notation of Rasmussen and Williams (2006).
For "wendland", we follow the notation of Bevilacqua et al. (2019).
A covariance function with the suffix "_tapered" refers to a covariance function that is multiplied by
a compactly supported Wendland covariance function (= tapering)
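To make the parametrization concrete, here is a small base R sketch of the exponential and Gaussian covariance functions in the form given by Diggle and Ribeiro (2007), with a marginal variance and a range parameter. The function and argument names (exp_cov, gauss_cov, sigma2, rho) are illustrative assumptions, and this is not the package's internal code.

```r
# Exponential covariance: sigma2 * exp(-d / rho)
exp_cov <- function(d, sigma2 = 1, rho = 1) sigma2 * exp(-d / rho)

# Gaussian covariance: sigma2 * exp(-(d / rho)^2)
gauss_cov <- function(d, sigma2 = 1, rho = 1) sigma2 * exp(-(d / rho)^2)

set.seed(1)
coords <- matrix(runif(10), ncol = 2)  # 5 points in 2 dimensions

# Pairwise Euclidean distances between the points
d <- as.matrix(dist(coords))

# Covariance matrix of the Gaussian process at these 5 points
Sigma <- exp_cov(d, sigma2 = 2, rho = 0.5)
print(round(Sigma, 3))
```

The diagonal of Sigma equals the marginal variance, and the off-diagonal entries decay with distance at a rate controlled by the range parameter.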
cov_fct_shape
A numeric specifying the shape parameter of the covariance function
(= smoothness parameter for Matern and Wendland covariance). For the Wendland covariance function,
we follow the notation of Bevilacqua et al. (2019).
This parameter is irrelevant for some covariance functions such as the exponential or Gaussian
cov_fct_taper_range
A numeric specifying the range parameter of the Wendland covariance function / taper.
We follow the notation of Bevilacqua et al. (2019)
vecchia_approx
A boolean. If TRUE, the Vecchia approximation is used
num_neighbors
An integer specifying the number of neighbors for the Vecchia approximation
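The role of the neighbor count can be sketched in base R: in a Vecchia approximation, each point, taken in a given ordering, conditions only on its nearest points that come earlier in that ordering, up to the chosen number of neighbors. The helper vecchia_neighbors below is hypothetical and only illustrates the neighbor selection; it is not how the package implements it. The guarded GPModel call uses the argument names documented on this page, which may differ in other package versions.

```r
# Hypothetical helper: neighbor sets for a Vecchia-type approximation
vecchia_neighbors <- function(coords, num_neighbors, ordering = c("none", "random")) {
  ordering <- match.arg(ordering)
  n <- nrow(coords)
  # "none" keeps the given order, "random" permutes the points
  ord <- if (ordering == "random") sample(n) else seq_len(n)
  coords <- coords[ord, , drop = FALSE]
  d <- as.matrix(dist(coords))
  neighbors <- vector("list", n)
  neighbors[[1]] <- integer(0)  # the first point has no predecessors
  for (i in seq_len(n)[-1]) {
    previous <- seq_len(i - 1)
    # Earlier points sorted by distance to point i, closest first
    nearest <- previous[order(d[i, previous])]
    neighbors[[i]] <- head(nearest, num_neighbors)
  }
  # Neighbor indices refer to positions in the ordered points
  list(order = ord, neighbors = neighbors)
}

set.seed(1)
coords <- matrix(runif(40), ncol = 2)  # 20 points in 2 dimensions
nb <- vecchia_neighbors(coords, num_neighbors = 3, ordering = "random")
print(lengths(nb$neighbors))

if (requireNamespace("gpboost", quietly = TRUE)) {
  # Argument names as documented on this page; wrapped in try() in case they differ
  gp_model <- try(
    gpboost::GPModel(gp_coords = coords, cov_function = "exponential",
                     vecchia_approx = TRUE, num_neighbors = 3L,
                     vecchia_ordering = "random"),
    silent = TRUE
  )
}
```

A smaller neighbor count makes the conditioning sets smaller and the computations cheaper, at the price of a coarser approximation.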
vecchia_ordering
A string specifying the ordering used in the Vecchia approximation.
"none" means the default ordering is used, "random" uses a random ordering
vecchia_pred_type
A string specifying the type of Vecchia approximation used for making predictions. The options are:
"order_obs_first_cond_obs_only" = observed data is ordered first and the neighbors are only observed points
"order_obs_first_cond_all" = observed data is ordered first and the neighbors are selected among all points (observed + predicted)
"order_pred_first" = predicted data is ordered first for making predictions
"latent_order_obs_first_cond_obs_only" = Vecchia approximation for the latent process, observed data is ordered first, and the neighbors are only observed points
"latent_order_obs_first_cond_all" = Vecchia approximation for the latent process, observed data is ordered first, and the neighbors are selected among all points
num_neighbors_pred
An integer specifying the number of neighbors for the Vecchia approximation for making predictions
cluster_ids
A vector with elements indicating independent realizations of
random effects / Gaussian processes (same values = same process realization).
The elements of 'cluster_ids' can be integer, double, or character.
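As a sketch of what independent realizations imply, the example below assigns simulated observations to three clusters and builds a covariance matrix that is zero between different clusters. The data, the cluster labels, and the unit-range exponential covariance are illustrative assumptions; the guarded GPModel call uses only arguments from the signature on this page.

```r
set.seed(1)
n <- 60
coords <- matrix(runif(2 * n), ncol = 2)

# Observations with the same ID share one realization of the process
cluster_ids <- rep(c("a", "b", "c"), each = n / 3)
print(table(cluster_ids))

# Exponential covariance within clusters, zero covariance across clusters
d <- as.matrix(dist(coords))
same_cluster <- outer(cluster_ids, cluster_ids, "==")
Sigma <- exp(-d) * same_cluster

# Build the model only when the package is available
if (requireNamespace("gpboost", quietly = TRUE)) {
  gp_model <- gpboost::GPModel(
    gp_coords = coords,
    cov_function = "exponential",
    cluster_ids = cluster_ids,
    likelihood = "gaussian"
  )
}
```

Because the clusters are ordered here, Sigma is block diagonal with one block per cluster.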
free_raw_data
A boolean. If TRUE, the data (groups, coordinates, covariate data for random coefficients)
is freed in R after initialization
likelihood
A string specifying the likelihood function (distribution) of the response variable.
Default = "gaussian"
Fabio Sigrist
# See https://github.com/fabsig/GPBoost/tree/master/R-package for more examples
library(gpboost)
# Load the example data shipped with the package (used below: group_data, coords)
data(GPBoost_data, package = "gpboost")

#--------------------Grouped random effects model: single-level random effect----------------
# Only the first grouping variable is used
gp_model <- GPModel(group_data = group_data[,1], likelihood="gaussian")

#--------------------Gaussian process model----------------
gp_model <- GPModel(gp_coords = coords, cov_function = "exponential",
                    likelihood="gaussian")

#--------------------Combine Gaussian process with grouped random effects----------------
gp_model <- GPModel(group_data = group_data,
                    gp_coords = coords, cov_function = "exponential",
                    likelihood="gaussian")