overlap: Overlap

Description

Computes misclassification probabilities and pairwise overlaps for finite mixture models with Gaussian components. The overlap between two components is defined as the sum of their two misclassification probabilities.

Usage

overlap(Pi, Mu, S, eps = 1e-06, lim = 1e06)

Arguments

Pi

vector of mixing proportions (length K).

Mu

matrix consisting of components' mean vectors (K * p).

S

set of components' covariance matrices (p * p * K).
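
As a rough illustration of the shapes these three arguments are expected to have, the base-R sketch below builds inputs for a hypothetical two-component, two-dimensional mixture. All numeric values are made up for illustration; only the dimensions (length K, K * p, and p * p * K) come from the descriptions above.

```r
# Hypothetical mixture: K = 2 components in p = 2 dimensions (illustrative values only).
K <- 2
p <- 2

Pi <- c(0.4, 0.6)                  # mixing proportions: length K, summing to 1
Mu <- rbind(c(0, 0), c(3, 1))      # K x p matrix: one mean vector per row

S <- array(0, dim = c(p, p, K))    # p x p x K array: one covariance matrix per slice
S[, , 1] <- diag(p)                               # identity covariance
S[, , 2] <- matrix(c(2, 0.5, 0.5, 1), p, p)       # symmetric, positive definite

# Check that the shapes agree before passing them on.
stopifnot(length(Pi) == K,
          all(dim(Mu) == c(K, p)),
          all(dim(S) == c(p, p, K)))
```

Objects built this way should be usable as the Pi, Mu, and S arguments in the Usage line above.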

eps

error bound for overlap computation.

lim

maximum number of integration terms (Davies, 1980).

Value

OmegaMap

matrix of misclassification probabilities (K * K); OmegaMap[i,j] is the probability that X coming from the i-th component is classified to the j-th component.

BarOmega

value of average overlap.

MaxOmega

value of maximum overlap.

rcMax

row and column numbers for the pair of components producing maximum overlap 'MaxOmega'.
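
The relationship between these values can be sketched in base R. The matrix below is hypothetical (made-up numbers, not output of overlap), and the sketch assumes that the average and maximum are taken over distinct component pairs, which is how the descriptions above read. The lower-case names are illustrative and are not part of the package.

```r
# Hypothetical 3 x 3 misclassification matrix; the numbers are made up for illustration.
# Entry [i, j]: probability that an observation from component i is assigned to component j.
OmegaMap <- matrix(c(0.97, 0.02, 0.01,
                     0.03, 0.90, 0.07,
                     0.00, 0.05, 0.95),
                   nrow = 3, byrow = TRUE)

# Pairwise overlap: sum of the two misclassification probabilities for each pair.
pair_overlap <- OmegaMap + t(OmegaMap)

# Keep each unordered pair once (i < j) and ignore the diagonal.
upper <- upper.tri(pair_overlap)
bar_omega <- mean(pair_overlap[upper])   # average overlap across pairs
max_omega <- max(pair_overlap[upper])    # maximum overlap across pairs

# Row and column of the pair attaining the maximum.
rc_max <- which(pair_overlap == max_omega & upper, arr.ind = TRUE)[1, ]
```

For these made-up numbers, components 2 and 3 form the most overlapping pair, since 0.07 + 0.05 is the largest of the three pairwise sums.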

References

Maitra, R. and Melnykov, V. (2010) ``Simulating data to study performance of finite mixture modeling and clustering algorithms'', The Journal of Computational and Graphical Statistics, 19(2), 354-376.

Melnykov, V., Chen, W.-C., and Maitra, R. (2012) ``MixSim: An R Package for Simulating Data to Study Performance of Finite Mixture Modeling and Clustering Algorithms'', Journal of Statistical Software, (submitted).

Davies, R. (1980) ``The distribution of a linear combination of chi-square random variables'', Applied Statistics, 29, 323-333.

Examples


library(MixSim)  # provides overlap()
data(iris)
p <- dim(iris)[2] - 1
K <- 3
id <- as.numeric(iris[, 5])

# estimate mixture parameters
Pi <- sapply(1:K, function(k){ sum(id == k) }) / dim(iris)[1]
Mu <- t(sapply(1:K, function(k){ colMeans(iris[id == k, -5]) }))
S <- sapply(1:K, function(k){ var(iris[id == k, -5]) })
dim(S) <- c(p, p, K)

overlap(Pi = Pi, Mu = Mu, S = S)