nor1mix (version 1.2-3)

llnorMix: Likelihood, Parametrization and EM-Steps For 1D Normal Mixtures

Description

These functions work with an almost unconstrained parametrization of univariate normal mixtures.

llnorMix(p, *)

computes the log likelihood,

obj <- par2norMix(p)

maps parameter vector p to a norMix object obj,

p <- nM2par(obj)

maps from a norMix object obj to parameter vector p,

where p is always a parameter vector in our parametrization.

Partly for didactical reasons, the following functions provide the basic ingredients for the EM algorithm (see also norMixEM) to parameter estimation:

estep.nm(x, obj, p)

computes 1 E-step for the data x, given either a "norMix" object obj or parameter vector p.

mstep.nm(x, z)

computes 1 M-step for the data x and the probability matrix z.

emstep.nm(x, obj)

computes 1 E- and 1 M-step for the data x and the "norMix" object obj.

where again, p is a parameter vector in our parametrization, x is the (univariate) data, and z is a \(n \times m\) matrix of (posterior) conditional probabilities, and \(\theta\) is the full parameter vector of the mixture model.

Usage

llnorMix(p, x, m = (length(p) + 1)/3)

par2norMix(p, name = sprintf("{from %s}", deparse(substitute(p))[1])) nM2par(obj)

estep.nm(x, obj, par) mstep.nm(x, z) emstep.nm(x, obj)

Arguments

p, par

numeric vector: our parametrization of a univariate normal mixture, see details.

x

numeric: the data for which the likelihood is to be computed.

m

integer number of mixture components; this is not to be changed for a given p.

name

(for par2norMix():) a name for the "norMix" object that is returned.

obj

a "norMix" object, see norMix.

z

a \(n \times m\) matrix of (posterior) conditional probabilities, \(z_{ij}= P(x_i \in C_j | \theta)\), where \(C_j\) denotes the \(j\)-th group (“cluster”).

Value

llnorMix() returns a number, namely the log-likelihood.

par2norMix() returns "norMix" object, see norMix.

nM2par() returns the parameter vector \(\theta\) of length \(3k-1\).

estep.nm() returns z, the matrix of (conditional) probabilities.

mstep.nm() returns the model parameters as a list with components w, mu, and sigma, corresponding to the arguments of norMix(). (and see the 'Examples' on using do.call(norMix, *) with it.)

emstep.nm() returns an updated "norMix" object.

Details

We use a parametrization of a (finite) univariate normal mixture which is particularly apt for likelihood maximization, namely, one whose parameter space is almost a full \(\mathbf{I\hskip-0.22em R}^m\), \(m = 3k-1\).

For a \(k\)-component mixture, we map to and from a parameter vector \(\theta\) (== p as R-vector) of length \(3k-1\). For mixture density $$\sum_{j=1}^k \pi_j \phi((t - \mu_j)/\sigma_j)/\sigma_j,$$ we logit-transform the \(\pi_j\) (for \(j \ge 2\)) and log-transform the \(\sigma_j\), such that \(\theta\) is partitioned into

p[ 1:(k-1)]:

p[j]\(= \mbox{logit}(\pi_{j+1})\) and \(\pi_1\) is given implicitly as \(\pi_1 = 1-\sum_{j=2}^k \pi_j\).

p[ k:(2k-1)]:

p[k-1+ j]\(= \mu_j\), for j=1:k.

p[2k:(3k-1)]:

p[2*k-1+ j] \(= \log(\sigma_j)\), i.e., \(\sigma_j^2 = exp(2*p[.+j])\).

See Also

norMix, logLik. Note that the log likelihood of a "norMix" object is directly given by sum(dnorMix(x, obj, log=TRUE)).

To fit, using the EM algorithm, rather use norMixEM() than the e.step, m.step, or em.step functions.

Note that direct likelihood maximization, i.e., MLE, is typically considerably more efficient than the EM, and typically converges well with our parametrization, see norMixMLE.

Examples

Run this code
# NOT RUN {
(obj <- MW.nm10) # "the Claw" -- m = 6 components
length(pp <- nM2par(obj)) # 17 == (3*6) - 1
par2norMix(pp)
## really the same as the initial 'obj' above

## Log likelihood (of very artificial data):
llnorMix(pp, x = seq(-2, 2, length=1000))
set.seed(47)## of more realistic data:
x <- rnorMix(1000, obj)
llnorMix(pp, x)

## Consistency check :  nM2par()  and  par2norMix()  are inverses
all.EQ <- function(x,y, tol = 1e-15, ...) all.equal(x,y, tolerance=tol, ...)
stopifnot(all.EQ(pp, nM2par(par2norMix(pp))),
          all.EQ(obj, par2norMix(nM2par(obj)),
                    check.attributes=FALSE),
          ## Direct computation of log-likelihood:
          all.EQ(sum(dnorMix(x, obj, log=TRUE)),
                    llnorMix(pp, x))  )

## E- and M- steps : ------------------------------
rE1 <- estep.nm(x, obj)
rE2 <- estep.nm(x, par=pp) # the same as rE1
z <- rE1
str( rM <-  mstep.nm(x, z))
   (rEM <- emstep.nm(x, obj))

stopifnot(all.EQ(rE1, rE2),
          all.EQ(rEM, do.call(norMix, c(rM, name=""))))
# }

Run the code above in your browser using DataLab