SSLfmm (version 0.1.0)

initialestimate: Initialize Parameters for a Finite Mixture Model (FMM) from a Labeled Subset

Description

Builds initial estimates \((\pi, \mu, \Sigma)\) for a g-component Gaussian mixture using only rows with observed labels in zm. Supports either a shared covariance (ncov = 1) or class-specific covariances (ncov = 2).

Usage

initialestimate(dat, zm, g, ncov = 2, ridge = 1e-06)

Value

A list with

  • pi: length-g vector of mixing proportions (summing to 1).

  • mu: p x g matrix of class means (column i is \(\mu_i\)).

  • sigma: if ncov = 1, a p x p shared covariance matrix; if ncov = 2, a p x p x g array of class-specific covariances.

Arguments

dat

A numeric matrix or data frame of features (n x p).

zm

Integer vector of length n with class labels in 1:g; use NA for unlabeled rows. Only labeled rows contribute to the initialization.

g

Integer, number of mixture components.

ncov

Integer, 1 for a shared covariance matrix, 2 for class-specific covariance matrices. Default 2.

ridge

Numeric, small diagonal ridge added to covariance(s) for numerical stability. Default 1e-6.
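
The effect of a diagonal ridge can be seen on a deliberately degenerate covariance matrix. The sketch below uses base R only and assumes the ridge is applied as S + ridge * I; the exact form used inside the package is not stated on this page, and the variable names are illustrative.

```r
# Two perfectly collinear features give a singular sample covariance.
x <- c(1, 2, 3, 4, 5)
X <- cbind(x, 2 * x)
S <- cov(X)

# Assumed form of the ridge: add a small constant to the diagonal.
ridge <- 1e-6
S_ridge <- S + ridge * diag(ncol(X))

det(S)               # numerically zero, so S cannot be inverted reliably
det(S_ridge)         # small but strictly positive
R <- chol(S_ridge)   # Cholesky succeeds because S_ridge is positive definite
```

A larger ridge makes the matrix better conditioned but moves the estimate further from the empirical covariance, so a small value such as the default is usually a reasonable starting point.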

Details

If a class has zero or one labeled sample, its covariance is set to the global empirical covariance (from labeled data) with a small ridge. Class means for empty classes default to the global mean with a small jitter.
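
The fallback rules above can be sketched in base R for the class-specific case (ncov = 2). This is an illustrative re-implementation under stated assumptions, not the package source: the function name sketch_init_classcov is hypothetical, the mixing proportions are assumed to be labeled-class frequencies, and the jitter scale of 1e-3 is an arbitrary choice.

```r
# Hypothetical sketch of the documented behaviour for ncov = 2 only.
sketch_init_classcov <- function(dat, zm, g, ridge = 1e-6) {
  X   <- as.matrix(dat)
  lab <- !is.na(zm)                  # only labeled rows contribute
  Xl  <- X[lab, , drop = FALSE]
  zl  <- zm[lab]
  p   <- ncol(X)

  # Global statistics from the labeled rows, used as fallbacks.
  global_mu  <- colMeans(Xl)
  global_cov <- cov(Xl) + ridge * diag(p)

  # Assumption: mixing proportions are the labeled class frequencies.
  counts <- tabulate(zl, nbins = g)
  pi_hat <- counts / sum(counts)

  mu    <- matrix(NA_real_, p, g)
  sigma <- array(NA_real_, dim = c(p, p, g))
  for (i in seq_len(g)) {
    Xi <- Xl[zl == i, , drop = FALSE]
    if (nrow(Xi) == 0) {
      # Empty class: global mean plus a small jitter (scale is an assumption).
      mu[, i] <- global_mu + rnorm(p, sd = 1e-3)
    } else {
      mu[, i] <- colMeans(Xi)
    }
    if (nrow(Xi) <= 1) {
      # Zero or one labeled sample: fall back to the global covariance.
      sigma[, , i] <- global_cov
    } else {
      sigma[, , i] <- cov(Xi) + ridge * diag(p)
    }
  }
  list(pi = pi_hat, mu = mu, sigma = sigma)
}

# Usage: request g = 3 when only classes 1 and 2 are labeled,
# so class 3 exercises both fallbacks.
set.seed(1)
X <- matrix(rnorm(50 * 3), 50, 3)
z <- sample(c(1:2, NA), 50, replace = TRUE, prob = c(0.4, 0.4, 0.2))
init3 <- sketch_init_classcov(X, z, g = 3)
dim(init3$sigma)
```

In this sketch an empty class receives a mixing proportion of zero; whether the package does the same is not documented here.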

Examples

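library(SSLfmm)

set.seed(1)
n <- 50; p <- 3; g <- 2
X <- matrix(rnorm(n * p), n, p)  # n x p feature matrix
# Labels in 1:g; NA marks unlabeled rows (about 20% of rows here)
z <- sample(c(1:g, NA), n, replace = TRUE, prob = c(0.4, 0.4, 0.2))
init <- initialestimate(X, z, g, ncov = 2)  # class-specific covariances
str(init)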