sum_logL_GP_clust

During the prediction step of MagmaClust, an EM algorithm is used to compute
the maximum likelihood estimator of the hyper-parameters along with
mixture probabilities for the new individual/task. This function implements
the quantity that is maximised (i.e. a sum of Gaussian log-likelihoods,
weighted by their mixture probabilities). When the 'prop_mixture' argument is
provided, the full log-likelihood is properly penalised, so the function can
also be used to monitor the EM algorithm.
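
As a rough sketch, the weighted sum described above can be written as follows. The notation is an assumption introduced here for illustration and is not taken from the package: y_* and t_* stand for the Output and Input columns of 'db', tau_{*k} for the 'mixture' probability of cluster k, m_k and C_k for the 'mean' and 'post_cov' entries of cluster k, and Psi for the covariance matrix built from 'kern' and 'hp' (with the 'pen_diag' jitter presumably added to the diagonal before inversion).

```latex
\mathcal{L}(\theta_*) \;=\; \sum_{k=1}^{K} \tau_{*k} \,
  \log \mathcal{N}\!\left( y_* \,;\; \hat{m}_k(t_*),\;
  \hat{C}_k^{t_*} + \Psi_{\theta_*}^{t_*} \right)
```

This sketch leaves out the sign convention of the returned value and the exact form of the penalty applied when 'prop_mixture' is supplied, since neither is stated here.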

internal

An implementation for the multi-task Gaussian processes with common
mean framework. Two main algorithms, called 'Magma' and 'MagmaClust',
are available to perform predictions for supervised learning problems, in
particular for time series or any functional/continuous data applications.
The corresponding articles were published, respectively, by Arthur Leroy,
Pierre Latouche, Benjamin Guedj and Servane Gey (2022)
<doi:10.1007/s10994-022-06172-1>, and Arthur Leroy, Pierre Latouche,
Benjamin Guedj and Servane Gey (2023) <https://jmlr.org/papers/v24/20-1321.html>.
These approaches leverage the learning of cluster-specific mean processes,
which are common across similar tasks, to provide enhanced predictive
performance (even far from data) at a linear computational cost (in
the number of tasks). 'MagmaClust' is a generalisation of 'Magma'
where the tasks are simultaneously clustered into groups, each being
associated with a specific mean process. User-oriented functions in the
package are decomposed into training, prediction and plotting
functions. Some basic features (classic kernels, training, prediction) of
standard Gaussian processes are also implemented.

Arthur Leroy

MagmaClustR

Clustering and Prediction using Multi-Task Gaussian Processes
with Common Mean

Pierre Latouche

Pierre Pathé

Alexia Grenouillat

Hugo Lelievre

Arguments

<dl><dt>hp</dt>
<dd>A tibble, data frame or named vector of hyper-parameters.</dd>
<dt>db</dt>
<dd>A tibble containing the data on which the log-likelihood is evaluated.
Required columns: Input, Output. Additional covariate columns are allowed.</dd>
<dt>mixture</dt>
<dd>A tibble or data frame, indicating the mixture probabilities
of each cluster for the new individual/task.</dd>
<dt>mean</dt>
<dd>A list of hyper-posterior mean parameters for all clusters.</dd>
<dt>kern</dt>
<dd>A kernel function.</dd>
<dt>post_cov</dt>
<dd>A list of hyper-posterior covariance parameters for all
clusters.</dd>
<dt>prop_mixture</dt>
<dd>A tibble or a named vector. Each column name or element name
should refer to a cluster. The value associated with each cluster
is a number between 0 and 1, corresponding to its mixture
proportion.</dd>
<dt>pen_diag</dt>
<dd>A jitter term that is added to the covariance matrix to avoid
numerical issues when inverting, in cases of nearly singular matrices.</dd></dl>


Compute a mixture of Gaussian log-likelihoods — sum_logL_GP_clust
