
mixAK (version 2.2)

NMixEM: EM algorithm for a homoscedastic normal mixture

Description

This function computes ML estimates of the parameters of a $p$-dimensional $K$-component normal mixture using the EM algorithm.
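The mechanics of the EM algorithm for this model can be illustrated with a small self-contained sketch in base R, for the univariate ($p=1$) homoscedastic case. This is an illustration of the technique only, not the NMixEM implementation itself; the helper name em_fit is hypothetical:

```r
## Minimal EM sketch for a univariate K-component normal mixture with a
## common (homoscedastic) variance. Illustrative only, not mixAK code.
em_fit <- function(y, K, toler = 1e-5, maxiter = 500) {
  n  <- length(y)
  w  <- rep(1 / K, K)                   # initial weights, all 1/K (as in NMixEM)
  mu <- quantile(y, probs = (1:K) / (K + 1), names = FALSE)  # spread-out means
  s2 <- var(y)                          # common variance
  loglik_old <- -Inf
  for (iter in 1:maxiter) {
    ## E-step: responsibilities r[i, k] = P(component k | y[i])
    dens <- sapply(1:K, function(k) w[k] * dnorm(y, mu[k], sqrt(s2)))
    r <- dens / rowSums(dens)
    ## M-step: update weights, means and the common variance
    nk <- colSums(r)
    w  <- nk / n
    mu <- colSums(r * y) / nk
    s2 <- sum(r * outer(y, mu, "-")^2) / n
    ## Log-likelihood at the pre-update parameters, used for convergence
    loglik <- sum(log(rowSums(dens)))
    if (abs(loglik - loglik_old) < toler) break
    loglik_old <- loglik
  }
  list(weight = w, mean = mu, Sigma = s2, loglik = loglik, iter = iter)
}
```

On well-separated simulated data (e.g. two components at 0 and 5), the sketch recovers weights, means, and the common variance close to the truth.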

Usage

NMixEM(y, K, weight, mean, Sigma, toler=1e-5, maxiter=500)

# S3 method for class 'NMixEM'
print(x, ...)

Arguments

y
vector (if $p=1$), matrix, or data frame (if $p > 1$) with the data. Rows correspond to observations, columns correspond to margins.
K
required number of mixture components.
weight
a numeric vector with initial mixture weights.

If not given, initial weights are all equal to $1/K$.

mean
vector or matrix of initial mixture means. For $p=1$ this should be a vector of length $K$, for $p>1$ this should be a $K\times p$ matrix with mixture means in rows.
Sigma
a number (if $p=1$) or a $p\times p$ matrix (if $p>1$) giving the initial variance/covariance matrix.
toler
tolerance to determine convergence.
maxiter
maximum number of iterations of the EM algorithm.
x
an object of class NMixEM.
...
additional arguments passed to the default print method.

Value

An object of class NMixEM which has the following components:

  • K: number of mixture components
  • weight: estimated mixture weights
  • mean: estimated mixture means
  • Sigma: estimated covariance matrix
  • loglik: log-likelihood value at the fitted values
  • aic: Akaike information criterion ($-2\hat{\ell} + 2\nu$), where $\hat{\ell}$ stands for the log-likelihood value at the fitted values and $\nu$ for the number of free model parameters
  • bic: Bayesian (Schwarz) information criterion ($-2\hat{\ell} + \log(n)\,\nu$), where $\hat{\ell}$ stands for the log-likelihood value at the fitted values, $\nu$ for the number of free model parameters, and $n$ for the sample size
  • iter: number of iterations of the EM algorithm used to reach the solution
  • iter.loglik: log-likelihood values at the individual EM iterations
  • iter.Qfun: values of the EM objective function at the individual EM iterations
  • dim: dimension $p$
  • nobs: number of observations $n$
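For the homoscedastic model, the number of free parameters $\nu$ entering aic and bic can be counted directly: $K-1$ free weights, $Kp$ means, and $p(p+1)/2$ distinct covariance entries. A small base-R sketch of this count (the helper names nu, aic, and bic are illustrative, not part of mixAK):

```r
## Free-parameter count for a homoscedastic p-dimensional K-component
## normal mixture: (K - 1) weights + K*p means + p*(p+1)/2 covariances.
nu <- function(K, p) (K - 1) + K * p + p * (p + 1) / 2

nu(K = 3, p = 4)   # e.g. the 3-component iris fit below: 2 + 12 + 10 = 24

## AIC and BIC from a log-likelihood value, following the formulas above
aic <- function(loglik, K, p)    -2 * loglik + 2 * nu(K, p)
bic <- function(loglik, K, p, n) -2 * loglik + log(n) * nu(K, p)
```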

References

Dempster, A. P., Laird, N. M., Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39, 1-38.

Examples

## Estimates for 3-component mixture in  Anderson's iris data
## ==========================================================
data(iris, package="datasets")
summary(iris)

VARS <- names(iris)[1:4]
fit <- NMixEM(iris[, VARS], K = 3)
print(fit)

apply(subset(iris, Species == "versicolor")[, VARS], 2, mean)
apply(subset(iris, Species == "setosa")[, VARS], 2, mean)
apply(subset(iris, Species == "virginica")[, VARS], 2, mean)

## Estimates of 6-component mixture in Galaxy data
## ==================================================
data(Galaxy, package="mixAK")
summary(Galaxy)

fit2 <- NMixEM(Galaxy, K = 6)
y <- seq(5, 40, length=300)
fy <- dMVNmixture(y, weight=fit2$weight, mean=fit2$mean,
                     Sigma=rep(fit2$Sigma, fit2$K))
hist(Galaxy, prob=TRUE, breaks=seq(5, 40, by=0.5),
     main="", xlab="Velocity (km/sec)", col="sandybrown")
lines(y, fy, col="darkblue", lwd=2)
