# GMfj

##### Multithreaded minimum message length Gaussian mixture trainer

Figueiredo and Jain's Gaussian mixture trainer, with all the options of `GM()`.

##### Usage

```
GMfj(
  X,
  Xw = rep(1, ncol(X)),
  alpha = numeric(0),
  mu = matrix(ncol = 0, nrow = 0),
  sigma = matrix(ncol = 0, nrow = 0),
  G = 5L,
  Gmin = 2L,
  convergenceEPS = 1e-05,
  alphaEPS = 0,
  eigenRatioLim = Inf,
  maxIter = 1000L,
  maxCore = 7L,
  tlimit = 3600,
  verbose = TRUE
)
```

##### Arguments

- `X`: A `d x N` numeric matrix where `N` is the number of observations and `d` is the dimensionality; each column is an observation. The column-observation representation promotes cache locality. See the sketch after this list.
- `Xw`: A numeric vector of size `N`. `Xw[i]` is the weight on observation `X[, i]`. Users should normalize `Xw` such that its elements sum up to `N`. Defaults to uniform weights on all observations.
- `alpha`: A numeric vector of size `K`, where `K` is the number of Gaussian kernels in the mixture model. `alpha` holds the initial mixture weights and should sum up to 1. Default empty.
- `mu`: A `d x K` numeric matrix. `mu[, i]` is the initial mean of the `i`th Gaussian kernel. Default empty.
- `sigma`: A `d^2 x K` numeric matrix. Each column is the flattened `d x d` initial covariance matrix of the corresponding Gaussian kernel; in R, `as.numeric(aMatrix)` gives the flattened version of `aMatrix`. The covariance matrix of each Gaussian kernel MUST be positive-definite. Default empty. See the sketch after this list.
- `G`: An integer. If at least one of `alpha`, `mu`, `sigma` is empty, the program initializes `G` Gaussian kernels via deterministic K-means++ initialization; see `KMppIni()`. Otherwise `G` is ignored. Default 5.
- `Gmin`: An integer. The final model should have at least `Gmin` kernels.
- `convergenceEPS`: A numeric value. If the average change of all parameters in the mixture model, relative to those in the previous iteration, falls below `convergenceEPS`, the program ends. Checking convergence this way is faster than recomputing the log-likelihood every iteration. Default 1e-5.
- `alphaEPS`: A numeric value. During training, any Gaussian kernel whose weight is no greater than `alphaEPS` is deleted. Default 0.
- `eigenRatioLim`: A numeric value. During training, any Gaussian kernel whose max:min eigenvalue ratio exceeds `eigenRatioLim` is treated as degenerate and deleted. Thresholding eigenvalue ratios helps minimize the effect of degenerate kernels at an early stage. Default `Inf`.
- `maxIter`: An integer, the maximal number of iterations.
- `maxCore`: An integer, the maximal number of threads to invoke. Should be no more than the number of logical processors on the machine. Default 7.
- `tlimit`: A numeric value. The program exits with the current model after `tlimit` seconds.
- `verbose`: A boolean value. `TRUE` prints progress.
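
As a minimal, self-contained sketch of preparing inputs in the expected format (variable names here are illustrative, not part of the API):

```
# Typical data is stored row-observation; transpose it so each column
# becomes an observation, giving the d x N layout GMfj() expects.
datRowMajor = matrix(rnorm(300), nrow = 100, ncol = 3)  # 100 obs, d = 3
X = t(datRowMajor)                                      # 3 x 100
d = nrow(X)

# Build the flattened d^2 x K `sigma` from K positive-definite matrices;
# here, two scaled identity matrices serve as illustrative initial values.
covList = list(diag(1, d), diag(2, d))
sigma = sapply(covList, as.numeric)  # each column is a flattened d x d matrix
dim(sigma)                           # 9 x 2, i.e., d^2 x K

# Recover the 1st covariance matrix from its flattened column.
matrix(sigma[, 1], nrow = d)
```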

##### Details

Although heavily cited, the paper contains some misleading information, and the algorithm's performance does not live up to its reputation. See <https://stats.stackexchange.com/questions/423935/figueiredo-and-jains-gaussian-mixture-em-convergence-criterion> for a discussion. Nevertheless, it is a worthwhile algorithm to try in practice.

##### Value

A list of size 5:

- `alpha`: a numeric vector of size `K`, the mixture weights.
- `mu`: a `d x K` numeric matrix. Each column is the mean of a Gaussian kernel.
- `sigma`: a `d^2 x K` numeric matrix. Each column is the flattened covariance matrix of a Gaussian kernel. Do `matrix(sigma[, i], nrow = d)` to recover the covariance matrix of the `i`th kernel; see the sketch after this list.
- `fitted`: a numeric vector of size `N`. `fitted[i]` is the probability density of the `i`th observation given by the mixture model.
- `clusterMember`: a list of `K` integer vectors, the hard clustering inferred from the mixture model. Each integer vector contains the indexes of the observations in `X` assigned to that kernel.
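
A minimal sketch of consuming the returned list, mirroring the Examples below (assumes the package is installed):

```
X = t(iris[1:4])   # 4 x 150, column-observation layout
d = nrow(X)
gmmRst = GMKMcharlie::GMfj(X, G = 3L, maxCore = 1L, verbose = FALSE)
sum(gmmRst$alpha)  # mixture weights sum to 1
# Unpack every kernel's covariance matrix from its flattened column.
covMatrices = lapply(seq_len(ncol(gmmRst$sigma)), function(i)
  matrix(gmmRst$sigma[, i], nrow = d))
```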

##### Warning

For one-dimensional data, `X` should still follow the data structure requirement: a matrix where each column is an observation, i.e., a `1 x N` matrix. A minimal sketch follows.
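
For instance (assuming the package is installed):

```
# Wrap a 1-D numeric vector into the required 1 x N matrix form.
x1d = rnorm(500)
X = matrix(x1d, nrow = 1)  # 1 x 500; each column is one observation
gmm1d = GMKMcharlie::GMfj(X, G = 3L, maxCore = 1L, verbose = FALSE)
```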

##### References

Mario A. T. Figueiredo and Anil K. Jain (2002). "Unsupervised learning of finite mixture models." IEEE Transactions on Pattern Analysis and Machine Intelligence 24(3): 381-396.

##### Examples

```
# =============================================================================
# Examples below use 1 thread to pass CRAN check. Speed advantage of multiple
# threads will be more pronounced for larger data.
# =============================================================================
# =============================================================================
# Parameterize the iris data. Let the function initialize Gaussian kernels.
# =============================================================================
X = t(iris[1:4])
# CRAN check only allows 2 threads at most. Increase `maxCore` for
# acceleration.
system.time({gmmRst = GMKMcharlie::GMfj(
  X, G = 25L, Gmin = 2L, maxCore = 1L, verbose = FALSE)})
str(gmmRst)
# =============================================================================
# Parameterize the iris data given Gaussian kernels.
# =============================================================================
G = 25L
d = nrow(X) # Dimensionality.
alpha = rep(1, G) / G
mu = X[, sample(ncol(X), G)] # Sample observations as initial means.
# Take the average variance and create initial covariance matrices.
meanVarOfEachDim = sum(diag(var(t(X)))) / d
covar = diag(meanVarOfEachDim / G, d)
covars = matrix(rep(as.numeric(covar), G), nrow = d * d)
# Models are sensitive to initialization.
system.time({gmmRst2 = GMKMcharlie::GMfj(
  X, alpha = alpha, mu = mu, sigma = covars, maxCore = 1L, verbose = FALSE)})
str(gmmRst2)
# =============================================================================
# For fun, fit Rosenbrock function with a Gaussian mixture.
# =============================================================================
set.seed(123)
rosenbrock <- function(x, y) {(1 - x) ^ 2 + 100 * (y - x ^ 2) ^ 2}
N = 2000L
x = runif(N, -2, 2)
y = runif(N, -1, 3)
z = rosenbrock(x, y)
X = rbind(x, y)
Xw = z * (N / sum(z)) # Weights on observations should sum up to N.
system.time({gmmFit = GMKMcharlie::GMfj(
  X, Xw = Xw, G = 5L, maxCore = 1L, verbose = FALSE)})
oldpar = par()$mfrow
par(mfrow = c(1, 2))
plot3D::points3D(x, y, z, pch = 20)
plot3D::points3D(x, y, gmmFit$fitted, pch = 20)
par(mfrow = oldpar)
# =============================================================================
# For fun, fit a 3D spiral distribution.
# =============================================================================
N = 2000
t = runif(N) ^ 2 * 15
x = cos(t) + rnorm(N) * 0.1
y = sin(t) + rnorm(N) * 0.1
z = t + rnorm(N) * 0.1
X = rbind(x, y, z)
d = 3L
G = 10L
system.time({gmmFit = GMKMcharlie::GMfj(
  X, G = G, maxCore = 1L, verbose = FALSE)})
# Sample N points from the Gaussian mixture.
ns = as.integer(round(N * gmmFit$alpha))
sampledPoints = list()
for(i in 1L : G)
{
  sampledPoints[[i]] = MASS::mvrnorm(
    ns[i], mu = gmmFit$mu[, i], Sigma = matrix(gmmFit$sigma[, i], nrow = d))
}
sampledPoints =
  matrix(unlist(lapply(sampledPoints, function(x) t(x))), nrow = d)
# Plot the original data and the samples from the mixture model.
oldpar = par()$mfrow
par(mfrow = c(1, 2))
plot3D::points3D(x, y, z, pch = 20)
plot3D::points3D(x = sampledPoints[1, ],
                 y = sampledPoints[2, ],
                 z = sampledPoints[3, ], pch = 20)
par(mfrow = oldpar)
```

*Documentation reproduced from package GMKMcharlie, version 1.0.3, License: GPL-3*