During the prediction step of MagmaClust, an EM algorithm is used to compute the maximum likelihood estimator of the hyper-parameters, along with the mixture probabilities for the new individual/task. This function computes the quantity that is maximised (i.e. a sum of Gaussian log-likelihoods, weighted by their mixture probabilities). It can also be used to monitor the EM algorithm, provided the 'prop_mixture' argument is supplied so that the full log-likelihood is properly penalised.
sum_logL_GP_clust(
  hp,
  db,
  mixture,
  mean,
  kern,
  post_cov,
  prop_mixture = NULL,
  pen_diag
)
A number: the expectation of the mixture of Gaussian log-likelihoods in the prediction step of MagmaClust. This quantity should increase at each step of the EM algorithm, so it can be used to monitor the procedure.
hp: A tibble, data frame or named vector of hyper-parameters.
db: A tibble containing the data on which the log-likelihood is evaluated. Required columns: Input, Output. Additional covariate columns are allowed.
mixture: A tibble or data frame, indicating the mixture probabilities of each cluster for the new individual/task.
mean: A list of hyper-posterior mean parameters for all clusters.
kern: A kernel function.
post_cov: A list of hyper-posterior covariance parameters for all clusters.
prop_mixture: A tibble or a named vector. Each column or element name should refer to a cluster. The value associated with each cluster is a number between 0 and 1, corresponding to the mixture proportions.
pen_diag: A jitter term that is added to the diagonal of the covariance matrix to avoid numerical issues when inverting nearly singular matrices.
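The weighted sum described above can be illustrated with a minimal base-R sketch. This is not the MagmaClustR implementation: `log_gauss`, `se_kernel` and `weighted_sum_logL` are hypothetical helper names, a single shared covariance matrix stands in for the kernel covariance combined with each cluster's hyper-posterior covariance, and the optional 'prop_mixture' penalisation is left out.

```r
# Log-density of a multivariate Gaussian, computed from a Cholesky factor.
log_gauss <- function(y, mu, sigma) {
  n <- length(y)
  chol_sigma <- chol(sigma)  # upper-triangular R with t(R) %*% R == sigma
  # Solve t(R) %*% z = y - mu, so that sum(z^2) is the quadratic form.
  z <- backsolve(chol_sigma, y - mu, transpose = TRUE)
  -0.5 * (n * log(2 * pi) + 2 * sum(log(diag(chol_sigma))) + sum(z^2))
}

# Squared-exponential kernel (hypothetical stand-in for 'kern').
se_kernel <- function(x, variance = 1, lengthscale = 1) {
  variance * exp(-0.5 * outer(x, x, "-")^2 / lengthscale^2)
}

# Sum over clusters of (mixture probability) x (Gaussian log-likelihood).
# 'means' is a named list of cluster mean vectors and 'mixture' a named
# vector of probabilities; both are simplified stand-ins for the real inputs.
weighted_sum_logL <- function(y, cov, means, mixture, pen_diag) {
  # Jitter on the diagonal keeps the Cholesky factorisation stable.
  cov_jit <- cov + diag(pen_diag, length(y))
  logliks <- vapply(means, function(mu) log_gauss(y, mu, cov_jit), numeric(1))
  sum(mixture[names(means)] * logliks)
}

# Toy data: three observations and two clusters.
x <- c(0, 1, 2)
y <- c(0.1, 0.9, 2.1)
means <- list(K1 = c(0, 1, 2), K2 = c(5, 5, 5))
mixture <- c(K1 = 0.7, K2 = 0.3)

value <- weighted_sum_logL(y, se_kernel(x), means, mixture, pen_diag = 1e-8)
```

In this toy setting, cluster K1 has a mean close to the observations, so its log-likelihood dominates the weighted sum. Recomputing `value` after each EM update is one way to check that the monitored quantity does not decrease.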