
mixAK (version 4.2)

NMixEM: EM algorithm for a homoscedastic normal mixture

Description

This function computes ML estimates of the parameters of a $p$-dimensional, $K$-component normal mixture with a covariance matrix common to all components (a homoscedastic mixture) using the EM algorithm.
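
For reference, the fitted homoscedastic mixture density can be written (the notation, with weights $w_k$, component means $\mu_k$, and common covariance matrix $\Sigma$, is ours and not the package's) as

$$f(y) = \sum_{k=1}^{K} w_k \, \varphi_p(y \mid \mu_k, \Sigma),$$

where $\varphi_p(\cdot \mid \mu, \Sigma)$ denotes the density of the $p$-dimensional normal distribution with mean $\mu$ and covariance matrix $\Sigma$; all $K$ components share the same $\Sigma$.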

Usage

NMixEM(y, K, weight, mean, Sigma, toler=1e-5, maxiter=500)
"print"(x, ...)

Arguments

y
vector (if $p = 1$), or matrix or data frame (if $p > 1$), with the data. Rows correspond to observations, columns correspond to margins.
K
required number of mixture components.
weight
a numeric vector with initial mixture weights.

If not given, initial weights are all equal to $1/K$.

mean
vector or matrix of initial mixture means. For $p = 1$ this should be a vector of length $K$; for $p > 1$, a $K \times p$ matrix with mixture means in rows.
Sigma
number (if $p = 1$) or $p \times p$ matrix (if $p > 1$) giving the initial variance/covariance matrix, common to all mixture components.
toler
tolerance to determine convergence.
maxiter
maximum number of iterations of the EM algorithm.
x
an object of class NMixEM.
...
additional arguments passed to the default print method.
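
As an illustrative sketch (not part of the package documentation; the simulated data, seed, and initial values below are hypothetical), the initial values can be supplied explicitly, e.g. for a bivariate ($p = 2$) two-component fit:

set.seed(1)
n <- 100
## simulate two well-separated bivariate clusters
y0 <- rbind(matrix(rnorm(2 * n, mean = 0), ncol = 2),
            matrix(rnorm(2 * n, mean = 3), ncol = 2))
fit0 <- NMixEM(y0, K = 2,
               weight = c(0.5, 0.5),              # initial weights
               mean   = rbind(c(0, 0), c(3, 3)),  # K x p matrix of initial means
               Sigma  = diag(2),                  # common p x p covariance matrix
               toler  = 1e-5, maxiter = 500)
print(fit0)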

Value

An object of class NMixEM which has the following components:
K
number of mixture components
weight
estimated mixture weights
mean
estimated mixture means
Sigma
estimated covariance matrix
loglik
log-likelihood value at fitted values
aic
Akaike information criterion, $-2\,\mathrm{loglik} + 2\nu$, where $\mathrm{loglik}$ denotes the log-likelihood at the fitted values and $\nu$ the number of free model parameters
bic
Bayesian (Schwarz) information criterion, $-2\,\mathrm{loglik} + \log(n)\,\nu$, where $\mathrm{loglik}$ denotes the log-likelihood at the fitted values, $\nu$ the number of free model parameters, and $n$ the sample size (see the sketch after this list)
iter
number of iterations of the EM algorithm used to get the solution
iter.loglik
values of the log-likelihood at iterations of the EM algorithm
iter.Qfun
values of the EM objective function at iterations of the EM algorithm
dim
dimension $p$
nobs
number of observations $n$
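
As an illustrative sketch (assuming a fitted object fit as in the examples below, and assuming the free-parameter count $\nu = (K-1) + Kp + p(p+1)/2$ for a homoscedastic mixture; this count is our assumption, not taken from the package), aic and bic can be reproduced from the returned components:

p  <- fit$dim
K  <- fit$K
nu <- (K - 1) + K * p + p * (p + 1) / 2   # assumed parameter count: weights + means + common covariance
c(aic = -2 * fit$loglik + 2 * nu,
  bic = -2 * fit$loglik + log(fit$nobs) * nu)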

References

Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39, 1-38.

Examples

## Not run: 
## Estimates for 3-component mixture in Anderson's iris data
## ==========================================================
data(iris, package="datasets")
summary(iris)

VARS <- names(iris)[1:4]
fit <- NMixEM(iris[, VARS], K = 3)
print(fit)

apply(subset(iris, Species == "versicolor")[, VARS], 2, mean)
apply(subset(iris, Species == "setosa")[, VARS], 2, mean)
apply(subset(iris, Species == "virginica")[, VARS], 2, mean)

## Estimates of 6-component mixture in Galaxy data
## ================================================
data(Galaxy, package="mixAK")
summary(Galaxy)

fit2 <- NMixEM(Galaxy, K = 6)
y <- seq(5, 40, length=300)
fy <- dMVNmixture(y, weight=fit2$weight, mean=fit2$mean,
                  Sigma=rep(fit2$Sigma, fit2$K))
hist(Galaxy, prob=TRUE, breaks=seq(5, 40, by=0.5),
     main="", xlab="Velocity (km/sec)", col="sandybrown")
lines(y, fy, col="darkblue", lwd=2)
## End(Not run)
