MoE_estep: E-step for MoEClust Models

Description

Applies the softmax function to compute the responsibility matrix z and the log-likelihood for MoEClust models, with the aid of MoE_dens.
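As a rough guide to what the softmax step computes, the quantities can be written out as follows. The symbols here are illustrative notation rather than the package's own: D_{ik} denotes the [i,k]-th entry of Dens (the log-density of observation i in component k, already scaled by the mixing proportions), and the sums run over the columns of Dens.

```latex
z_{ik} = \frac{\exp(D_{ik})}{\sum_{j} \exp(D_{ij})},
\qquad
\log L = \sum_{i=1}^{n} \log \sum_{j} \exp(D_{ij})
```

The inner term of the log-likelihood is the log-sum-exp of row i, which is also the logarithm of the denominator of z_{ik}.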

Usage

MoE_estep(data,
          mus,
          sigs,
          log.tau = 0L,
          Vinv = NULL,
          Dens = NULL)

Value

A list containing two elements:

z: A matrix with n rows and G columns containing the probability of cluster membership for each of n observations and G clusters.
loglik: The estimated log-likelihood, computed efficiently via logsumexp.
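The two returned elements can be illustrated with a minimal base-R sketch. This is not the package's implementation: estep_sketch is a hypothetical helper written only for illustration, and it assumes its input already holds log-densities scaled by the mixing proportions.

```r
# Hypothetical helper, NOT part of MoEClust: row-wise softmax via log-sum-exp.
estep_sketch <- function(Dens) {
  # A plain vector is treated as a single observation (one-row matrix).
  if (is.null(dim(Dens))) Dens <- matrix(Dens, nrow = 1L)
  # Subtract each row's maximum before exponentiating to avoid overflow.
  row_max <- apply(Dens, 1L, max)
  # Log-sum-exp per observation (recycling a length-n vector works row-wise here).
  lse     <- row_max + log(rowSums(exp(Dens - row_max)))
  list(z = exp(Dens - lse),   # responsibilities; each row sums to 1
       loglik = sum(lse))     # log-likelihood summed over observations
}

# Toy input: 2 observations, 2 components, values are log(tau * density).
Dens <- matrix(log(c(0.20, 0.10,
                     0.05, 0.30)), nrow = 2, byrow = TRUE)
out  <- estep_sketch(Dens)
rowSums(out$z)
out$loglik
```

Subtracting the row maximum leaves the result unchanged mathematically but keeps exp() from overflowing or underflowing when the log-densities are large in magnitude.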

Arguments

data: If there are no expert network covariates, data should be a numeric matrix or data frame, wherein rows correspond to observations (n) and columns correspond to variables (d). If there are expert network covariates, this should be a list of length G containing matrices/data.frames of (multivariate) WLS residuals for each component.
mus: The mean for each of G components. If there is more than one component, this is a matrix whose k-th column is the mean of the k-th component of the mixture model. For univariate models, this is a G-vector of means. In the presence of expert network covariates, all values should be equal to 0.
sigs: The variance component in the parameters list from the output to e.g. MoE_clust. The components of this list depend on the specification of modelName (see mclustVariance for details). The number of components G, the number of variables d, and the modelName are inferred from sigs.
log.tau: If covariates enter the gating network, an n x G matrix of mixing proportions; otherwise, a G-vector of mixing proportions for the components of the mixture. Must be on the log scale in both cases. The default of 0 effectively means that the densities (or log-densities) are not scaled by the mixing proportions.
Vinv: An estimate of the reciprocal hypervolume of the data region. See the function noise_vol. Used only if an initial guess as to which observations are noise is supplied. Mixing proportion(s) must be included for the noise component also.
Dens: (Optional) A numeric matrix whose [i,k]-th entry is the log-density of observation i in component k, scaled by the mixing proportions, to which the softmax function is to be applied. This is typically obtained from MoE_dens, though it need not be. If Dens is supplied, all other arguments are ignored; otherwise, MoE_dens is called with the other supplied arguments. If a vector is supplied, it will be coerced to a matrix with one row.
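The two accepted shapes of log.tau can be pictured with a small base-R sketch of how log mixing proportions combine with log-densities. This is only a conceptual illustration under assumed toy values; it does not reproduce how MoE_dens is implemented internally, and all object names below are hypothetical.

```r
# Hypothetical toy values: n = 3 observations, G = 2 components.
n <- 3L; G <- 2L
log_dens <- matrix(log(c(0.5, 0.1, 0.2,    # component 1 densities
                         0.4, 0.3, 0.6)),  # component 2 densities
                   nrow = n, ncol = G)

# No gating covariates: a G-vector, one log-proportion per component,
# added to every row (i.e. column-wise).
log_tau_vec <- log(c(0.7, 0.3))
scaled_vec  <- sweep(log_dens, 2L, log_tau_vec, FUN = "+")

# Gating covariates: an n x G matrix, one row of log-proportions per observation.
log_tau_mat <- log(matrix(c(0.7, 0.6, 0.5,
                            0.3, 0.4, 0.5), nrow = n, ncol = G))
scaled_mat  <- log_dens + log_tau_mat

# log.tau = 0 (the default) leaves the log-densities unscaled.
scaled_none <- log_dens + 0
```

Working on the log scale turns multiplication by the mixing proportions into addition, which is why log.tau must be supplied as logarithms in both cases.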

Author

Keefe Murphy - <keefe.murphy@mu.ie>

Examples

data(ais)
hema   <- ais[,3:7]
model  <- MoE_clust(hema, G=3, gating= ~ BMI + sex, modelNames="EEE", network.data=ais)
Dens   <- MoE_dens(data=hema, mus=model$parameters$mean,
                   sigs=model$parameters$variance, log.tau=log(model$parameters$pro))

# Construct the z matrix and compute the log-likelihood
Estep  <- MoE_estep(Dens=Dens)
(ll    <- Estep$loglik)

# Check that the z matrix & classification are the same as those from the model
identical(max.col(Estep$z), as.integer(unname(model$classification))) #TRUE
identical(Estep$z, model$z)                                           #TRUE

# Call MoE_estep directly
Estep2 <- MoE_estep(data=hema, sigs=model$parameters$variance,
                    mus=model$parameters$mean, log.tau=log(model$parameters$pro))
identical(Estep2$loglik, ll)                                          #TRUE

# The same can be done for models with expert covariates &/or a noise component
# Note for models with expert covariates that the mean has to be supplied as 0,
# and the data has to be supplied as "resid.data"
m2     <- MoE_clust(hema, G=2, expert= ~ sex, modelNames="EVE", network.data=ais, tau0=0.1)
Estep3 <- MoE_estep(data=m2$resid.data, sigs=m2$parameters$variance, mus=0, 
                    log.tau=log(m2$parameters$pro), Vinv=m2$parameters$Vinv)
