overlap: Overlap

Description

Computes misclassification probabilities and pairwise overlaps for finite mixture models with Gaussian components. The overlap between two components is defined as the sum of their two misclassification probabilities.
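To make the definition concrete, the following base-R sketch approximates a single misclassification probability by Monte Carlo: it draws X from the i-th component and counts how often the weighted density of the j-th component is larger. It is only an illustration of the quantity being computed, not the method used by overlap(), which evaluates the probabilities numerically (Davies, 1980). The helper names log_dmvn, r_mvn and mc_misclass are hypothetical and are not part of the package.

```r
set.seed(1)

# Log-density of a multivariate normal evaluated at the rows of x.
log_dmvn <- function(x, mu, S) {
  R <- chol(S)                                    # S = t(R) %*% R
  z <- backsolve(R, t(x) - mu, transpose = TRUE)  # solves t(R) z = x - mu
  -0.5 * colSums(z^2) - sum(log(diag(R))) - 0.5 * length(mu) * log(2 * pi)
}

# Draw n observations from N(mu, S), one per row.
r_mvn <- function(n, mu, S) {
  z <- matrix(rnorm(n * length(mu)), nrow = n)
  sweep(z %*% chol(S), 2, mu, "+")
}

# Monte Carlo estimate of the probability that X from component i
# is assigned to component j (weighted density of j exceeds that of i).
mc_misclass <- function(i, j, Pi, Mu, S, n = 1e5) {
  x  <- r_mvn(n, Mu[i, ], S[, , i])
  li <- log(Pi[i]) + log_dmvn(x, Mu[i, ], S[, , i])
  lj <- log(Pi[j]) + log_dmvn(x, Mu[j, ], S[, , j])
  mean(lj > li)
}

# Illustrative two-component mixture in two dimensions (assumed values).
Pi_demo <- c(0.5, 0.5)
Mu_demo <- rbind(c(0, 0), c(2, 0))
S_demo  <- array(c(diag(2), diag(2)), dim = c(2, 2, 2))

omega_12 <- mc_misclass(1, 2, Pi_demo, Mu_demo, S_demo)
omega_21 <- mc_misclass(2, 1, Pi_demo, Mu_demo, S_demo)
overlap_12 <- omega_12 + omega_21  # pairwise overlap of components 1 and 2
```

For this symmetric setup the decision boundary is the line x1 = 1, so each misclassification probability has the closed form pnorm(-1), which gives a quick sanity check on the simulation.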

Usage

overlap(Pi, Mu, S, eps = 1e-06, lim = 1e06)

Value

OmegaMap: matrix of misclassification probabilities (K * K); OmegaMap[i,j] is the probability that X coming from the i-th component is classified to the j-th component.
BarOmega: value of average overlap.
MaxOmega: value of maximum overlap.
rcMax: row and column numbers for the pair of components producing maximum overlap 'MaxOmega'.
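As a rough illustration of how the summary values relate to the misclassification matrix, the sketch below derives the average overlap, the maximum overlap and the corresponding pair of components from a small matrix. The matrix entries are made-up values chosen for illustration, and the variable names simply mirror the returned components; the package's internal computation may differ in detail.

```r
# Hypothetical 3-component misclassification matrix (illustrative values only).
# Entry [i, j] plays the role of OmegaMap[i, j]; diagonal entries are not used.
OmegaMap <- matrix(c(1.00, 0.02, 0.00,
                     0.03, 1.00, 0.04,
                     0.00, 0.06, 1.00), nrow = 3, byrow = TRUE)

# Pairwise overlap: sum of the two misclassification probabilities.
pair_overlap <- OmegaMap + t(OmegaMap)

# Each unordered pair appears once in the strict upper triangle.
upper <- upper.tri(pair_overlap)

BarOmega <- mean(pair_overlap[upper])  # average over all K(K-1)/2 pairs
MaxOmega <- max(pair_overlap[upper])   # largest pairwise overlap

# Locate the pair attaining the maximum, ignoring the diagonal and lower triangle.
masked <- ifelse(upper, pair_overlap, -Inf)
rcMax  <- as.vector(arrayInd(which.max(masked), dim(masked)))
```

Here the pairwise overlaps are 0.05, 0 and 0.10 for the pairs (1,2), (1,3) and (2,3), so the second and third components form the most overlapping pair.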

Arguments

Pi: vector of mixing proportions (length K).
Mu: matrix consisting of components' mean vectors (K * p).
S: set of components' covariance matrices (p * p * K).
eps: error bound for overlap computation.
lim: maximum number of integration terms (Davies, 1980).
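Because the three parameter arguments must agree on K and p, it can help to check their shapes before calling overlap(). The helper below is a hypothetical convenience function written in base R, not something the package provides; it assumes that the mixing proportions are positive and sum to one and that every covariance slice is symmetric.

```r
# Hypothetical helper: verify that Pi, Mu and S have consistent dimensions.
check_mixture_args <- function(Pi, Mu, S) {
  K <- length(Pi)
  stopifnot(is.matrix(Mu))
  p <- ncol(Mu)
  stopifnot(
    all(Pi > 0), isTRUE(all.equal(sum(Pi), 1)),  # valid mixing proportions
    nrow(Mu) == K,                               # one mean vector per component
    is.array(S), identical(dim(S), c(p, p, K))   # one p x p matrix per component
  )
  for (k in seq_len(K)) {
    # matrix() keeps the slice two-dimensional even when p = 1
    stopifnot(isSymmetric(matrix(S[, , k], p, p)))
  }
  invisible(TRUE)
}

# Illustrative inputs with K = 2 components in p = 2 dimensions.
Pi_ok <- c(0.3, 0.7)
Mu_ok <- rbind(c(0, 0), c(3, 1))
S_ok  <- array(c(diag(2), 2 * diag(2)), dim = c(2, 2, 2))

args_ok <- check_mixture_args(Pi_ok, Mu_ok, S_ok)
```

A mismatch, such as a Mu with fewer rows than there are entries in Pi, stops with an error instead of being passed on to the computation.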

Author

Volodymyr Melnykov, Wei-Chen Chen, and Ranjan Maitra.

References

Maitra, R. and Melnykov, V. (2010) "Simulating data to study performance of finite mixture modeling and clustering algorithms", Journal of Computational and Graphical Statistics, 19(2), 354-376.

Melnykov, V., Chen, W.-C., and Maitra, R. (2012) "MixSim: An R Package for Simulating Data to Study Performance of Clustering Algorithms", Journal of Statistical Software, 51(12), 1-25.

Davies, R. (1980) "The distribution of a linear combination of chi-square random variables", Applied Statistics, 29, 323-333.

Examples

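library(MixSim)

data("iris", package = "datasets")
p <- ncol(iris) - 1          # number of numeric variables
id <- as.integer(iris[, 5])  # species labels as integers 1..K
K <- max(id)                 # number of components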

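# estimate mixture parameters from the labelled data
Pi <- prop.table(tabulate(id))                                   # mixing proportions
Mu <- t(sapply(1:K, function(k){ colMeans(iris[id == k, -5]) })) # K x p matrix of means
S <- sapply(1:K, function(k){ var(iris[id == k, -5]) })          # each column is a flattened p x p covariance
dim(S) <- c(p, p, K)                                             # reshape into a p x p x K array

# misclassification probabilities and overlap summaries
overlap(Pi = Pi, Mu = Mu, S = S)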
