EM_FMM_SemiSupervised_Initial: Quick Initializer for alpha, xi, and Mixture Parameters

Description

Provides rough initial estimates of the missingness parameters alpha (the MCAR proportion) and xi (the coefficients of the logistic MAR model), together with the mixture parameters pi, mu, and the covariance matrices, using a lightweight EM-style routine. The covariance structure is chosen automatically from the form of Sigma_init:

If Sigma_init is a \(p \times p\) matrix, a shared (equal) covariance is used.
If Sigma_init is a list of length g of \(p \times p\) matrices or a \(p \times p \times g\) array, class-specific (unequal) covariances are used.
If Sigma_init is NULL, a shared covariance is estimated from the labeled data.
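
The dispatch rule above can be sketched as a small check on the shape of Sigma_init. This is an illustration only: sigma_structure is a hypothetical helper written for this page, not a function exported by the package, and the package's own validation may differ.

```r
# Hypothetical helper (not part of the package): report which covariance
# structure a given Sigma_init would select under the rules listed above.
sigma_structure <- function(Sigma_init, p, g) {
  if (is.null(Sigma_init)) {
    # No initial value supplied: a shared covariance is estimated.
    return("shared")
  }
  is_pp <- function(S) is.matrix(S) && all(dim(S) == c(p, p))
  if (is.list(Sigma_init)) {
    # List of g matrices, each p x p.
    if (length(Sigma_init) == g && all(vapply(Sigma_init, is_pp, logical(1)))) {
      return("class-specific")
    }
  } else if (is.array(Sigma_init) && length(dim(Sigma_init)) == 3) {
    # p x p x g array.
    if (all(dim(Sigma_init) == c(p, p, g))) {
      return("class-specific")
    }
  } else if (is_pp(Sigma_init)) {
    # Single p x p matrix.
    return("shared")
  }
  stop("Sigma_init has an unsupported shape")
}

sigma_structure(diag(2), p = 2, g = 2)                     # "shared"
sigma_structure(list(diag(2), diag(2)), p = 2, g = 2)      # "class-specific"
sigma_structure(array(diag(2), c(2, 2, 2)), p = 2, g = 2)  # "class-specific"
sigma_structure(NULL, p = 2, g = 2)                        # "shared"
```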

This function is intended as a fast, heuristic initializer rather than a final estimator for the mixed missingness model.

Usage

EM_FMM_SemiSupervised_Initial(
  Y_labelled,
  Z_labelled,
  Y_unlabelled,
  g = 2,
  pi_init = NULL,
  mu_init = NULL,
  Sigma_init = NULL,
  alpha_init = 0.01,
  warm_up_iter = 50,
  tol = 1e-06
)

Value

A list with elements:

pi - length-g vector of mixing proportions.
mu - list of g mean vectors.
Sigma - shared \(p \times p\) matrix (equal-Sigma) or list of g matrices (unequal-Sigma).
xi - length-2 numeric vector c(xi0, xi1) from the logistic MAR model.
alpha - estimated MCAR proportion.
gamma - \(n \times g\) responsibility matrix.
d2_yj - numeric vector of entropy-based scores used in the missingness model.
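
To give a feel for how an entropy-based score can feed a logistic missingness model, the sketch below computes the Shannon entropy of each row of a responsibility matrix and passes it through a logistic link. This is an assumption-laden illustration: the exact definition of d2_yj is not stated on this page and may differ, and row_entropy, missing_prob, and the coefficient values are hypothetical.

```r
# Illustrative only: Shannon entropy of each row of a responsibility matrix.
# eps guards against log(0) for responsibilities that are exactly zero.
row_entropy <- function(gamma, eps = 1e-12) {
  -rowSums(gamma * log(pmax(gamma, eps)))
}

# Logistic link with hypothetical coefficients xi = c(xi0, xi1).
missing_prob <- function(score, xi) {
  plogis(xi[1] + xi[2] * score)
}

# Two observations: one ambiguous, one confidently assigned.
gamma <- rbind(c(0.50, 0.50),
               c(0.99, 0.01))
ent <- row_entropy(gamma)        # the ambiguous row has the larger entropy
probs <- missing_prob(ent, xi = c(-1, 2))
```

With a positive slope, as in these made-up coefficients, the more ambiguous observation receives the higher probability under the logistic link.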

Arguments

Y_labelled: Numeric matrix of labeled observations (\(n_L \times p\)).
Z_labelled: Integer vector of length \(n_L\) giving the class label (in 1:g) of each row of Y_labelled.
Y_unlabelled: Numeric matrix of unlabeled observations (\(n_U \times p\)).
g: Integer, number of mixture components (default 2).
pi_init: Optional numeric length-g vector of initial mixing proportions.
mu_init: Optional list of length g of initial mean vectors (each of length p).
Sigma_init: Optional initial covariance: a \(p \times p\) matrix (shared), or a list of g \(p \times p\) matrices, or a \(p \times p \times g\) array (class-specific).
alpha_init: Numeric in \((0,1)\), initial MCAR proportion (default 0.01).
warm_up_iter: Integer, number of warm-up EM iterations used to refine the quick initial estimates (default 50).
tol: Convergence tolerance on alpha (default 1e-6).
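
Examples

A minimal sketch of preparing inputs in the shapes described above and calling the initializer with its documented arguments. The simulated data, sample sizes, and component centers are arbitrary assumptions chosen for illustration; the call is guarded so that the snippet still runs when the function is not loaded.

```r
set.seed(1)
p <- 2; g <- 2
n_L <- 40; n_U <- 60

# Simulated two-component Gaussian data (illustrative values only).
centers <- rbind(c(0, 0), c(3, 3))
Z_labelled <- sample(1:g, n_L, replace = TRUE)
Y_labelled <- centers[Z_labelled, ] + matrix(rnorm(n_L * p), n_L, p)

# Unlabeled observations: the true labels are generated, then discarded.
Z_hidden <- sample(1:g, n_U, replace = TRUE)
Y_unlabelled <- centers[Z_hidden, ] + matrix(rnorm(n_U * p), n_U, p)

# Call the initializer only if it is available in the current session.
if (exists("EM_FMM_SemiSupervised_Initial")) {
  init <- EM_FMM_SemiSupervised_Initial(
    Y_labelled, Z_labelled, Y_unlabelled,
    g = g, alpha_init = 0.01, warm_up_iter = 50, tol = 1e-6
  )
  str(init)  # inspect the returned list (pi, mu, Sigma, xi, alpha, ...)
}
```

Leaving Sigma_init as NULL here requests a shared covariance estimated from the labeled data.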