
mixAK (version 4.2)

NMixEM: EM algorithm for a homoscedastic normal mixture

Description

This function computes ML estimates of the parameters of a $p$-dimensional, $K$-component normal mixture with a covariance matrix common to all components (a homoscedastic mixture) using the EM algorithm.
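
For reference, the fitted homoscedastic mixture density can be written (the notation, with weights $w_k$, component means $\mu_k$, and common covariance matrix $\Sigma$, is ours and not the package's) as

$$f(y) = \sum_{k=1}^{K} w_k \, \varphi_p(y \mid \mu_k, \Sigma),$$

where $\varphi_p(\cdot \mid \mu, \Sigma)$ denotes the density of the $p$-dimensional normal distribution with mean $\mu$ and covariance matrix $\Sigma$; all $K$ components share the same $\Sigma$.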

Usage

NMixEM(y, K, weight, mean, Sigma, toler=1e-5, maxiter=500)
"print"(x, ...)

Arguments

y
vector (if $p = 1$), or matrix or data frame (if $p > 1$), with the data. Rows correspond to observations, columns correspond to margins.
K
required number of mixture components.
weight
a numeric vector with initial mixture weights.

If not given, initial weights are all equal to $1/K$.

mean
vector or matrix of initial mixture means. For $p = 1$ this should be a vector of length $K$; for $p > 1$, a $K \times p$ matrix with mixture means in rows.
Sigma
number (if $p = 1$) or $p \times p$ matrix (if $p > 1$) giving the initial variance/covariance matrix, common to all mixture components.
toler
tolerance to determine convergence.
maxiter
maximum number of iterations of the EM algorithm.
x
an object of class NMixEM.
...
additional arguments passed to the default print method.
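
As an illustrative sketch (not part of the package documentation; the simulated data, seed, and initial values below are hypothetical), the initial values can be supplied explicitly, e.g. for a bivariate ($p = 2$) two-component fit:

set.seed(1)
n <- 100
## simulate two well-separated bivariate clusters
y0 <- rbind(matrix(rnorm(2 * n, mean = 0), ncol = 2),
            matrix(rnorm(2 * n, mean = 3), ncol = 2))
fit0 <- NMixEM(y0, K = 2,
               weight = c(0.5, 0.5),              # initial weights
               mean   = rbind(c(0, 0), c(3, 3)),  # K x p matrix of initial means
               Sigma  = diag(2),                  # common p x p covariance matrix
               toler  = 1e-5, maxiter = 500)
print(fit0)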

Value

An object of class NMixEM which has the following components:
K
number of mixture components
weight
estimated mixture weights
mean
estimated mixture means
Sigma
estimated covariance matrix
loglik
log-likelihood value at fitted values
aic
Akaike information criterion, $-2\,\mathrm{loglik} + 2\nu$, where $\mathrm{loglik}$ denotes the log-likelihood at the fitted values and $\nu$ the number of free model parameters
bic
Bayesian (Schwarz) information criterion, $-2\,\mathrm{loglik} + \log(n)\,\nu$, where $\mathrm{loglik}$ denotes the log-likelihood at the fitted values, $\nu$ the number of free model parameters, and $n$ the sample size (see the sketch after this list)
iter
number of iterations of the EM algorithm used to get the solution
iter.loglik
values of the log-likelihood at iterations of the EM algorithm
iter.Qfun
values of the EM objective function at iterations of the EM algorithm
dim
dimension $p$
nobs
number of observations $n$
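
As an illustrative sketch (assuming a fitted object fit as in the examples below, and assuming the free-parameter count $\nu = (K-1) + Kp + p(p+1)/2$ for a homoscedastic mixture; this count is our assumption, not taken from the package), aic and bic can be reproduced from the returned components:

p  <- fit$dim
K  <- fit$K
nu <- (K - 1) + K * p + p * (p + 1) / 2   # assumed parameter count: weights + means + common covariance
c(aic = -2 * fit$loglik + 2 * nu,
  bic = -2 * fit$loglik + log(fit$nobs) * nu)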

References

Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39, 1-38.

Examples

## Not run: 
## Estimates for 3-component mixture in Anderson's iris data
## ==========================================================
data(iris, package="datasets")
summary(iris)

VARS <- names(iris)[1:4]
fit <- NMixEM(iris[, VARS], K = 3)
print(fit)

apply(subset(iris, Species == "versicolor")[, VARS], 2, mean)
apply(subset(iris, Species == "setosa")[, VARS], 2, mean)
apply(subset(iris, Species == "virginica")[, VARS], 2, mean)

## Estimates of 6-component mixture in Galaxy data
## ================================================
data(Galaxy, package="mixAK")
summary(Galaxy)

fit2 <- NMixEM(Galaxy, K = 6)
y <- seq(5, 40, length=300)
fy <- dMVNmixture(y, weight=fit2$weight, mean=fit2$mean,
                  Sigma=rep(fit2$Sigma, fit2$K))
hist(Galaxy, prob=TRUE, breaks=seq(5, 40, by=0.5),
     main="", xlab="Velocity (km/sec)", col="sandybrown")
lines(y, fy, col="darkblue", lwd=2)
## End(Not run)
