plot-methods: Rootogram of Posterior Probabilities

Description

The plot method for flexmix-class objects gives a rootogram or histogram of the posterior probabilities.

Usage

## S4 method for signature 'flexmix,missing'
plot(x, y, mark=NULL, markcol="red",
  col = NULL, eps=1e-4, root=TRUE, ylim=TRUE, main=NULL, xlab="",
  ylab="", as.table=TRUE, endpoints=c(-0.04, 1.04), ...)

Arguments

x

an object of class "flexmix"

y

not used

mark

integer, mark posteriors of this component

markcol

color used for marking components

col

color used for the bars

eps

posteriors smaller than eps are ignored

root

if TRUE, a rootogram of the posterior probabilities is drawn, otherwise a standard histogram

ylim

a logical value or a numeric vector of length 2. If TRUE, the y axes of all rootograms are aligned to have the same limits; if FALSE, each y axis is scaled separately. If a numeric vector is specified, it is used as usual.

main

main title of the plot

xlab

label of x-axis

ylab

label of y-axis

as.table

logical that controls the order in which panels should be plotted: if 'TRUE' (the default), panels are drawn left to right, top to bottom (as in a table); if 'FALSE', left to right, bottom to top (as in a graph).

endpoints

vector of length 2 indicating the range of x-values that is to be covered by the histogram. This applies only when 'breaks' (an argument of the lattice function histogram) is unspecified. In the lattice function 'do.breaks', this specifies the interval that is to be divided up.

...

further graphical parameters for the lattice function histogram
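
The arguments above can be combined as in the following minimal sketch. It assumes that the flexmix package and its bundled NPreg data set are installed; the model formula, the seed and the object names ex1 and post are illustrative choices, not part of this help page.

```r
library("flexmix")

# NPreg is an example data set shipped with flexmix
data("NPreg", package = "flexmix")

# Fit a two-component mixture of quadratic regressions (illustrative model)
set.seed(1)
ex1 <- flexmix(yn ~ x + I(x^2), data = NPreg, k = 2)

# Default call: rootograms with a common y axis for all components
plot(ex1)

# Histograms instead of rootograms, marking the posteriors of component 1,
# with the y axis of each panel scaled separately
plot(ex1, mark = 1, root = FALSE, ylim = FALSE)

# The matrix of posterior probabilities that the plots summarize
post <- posterior(ex1)
```

Each row of post holds the posterior probabilities of one observation, so the rows sum to one.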

Details

For each mixture component a rootogram or histogram of the posterior probabilities of all observations is drawn. Rootograms are very similar to histograms; the only difference is that the height of each bar corresponds to the square root of the count rather than the count itself. Hence low counts are more visible and peaks are less emphasized.
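
The relation between the two displays can be sketched in base R, independently of flexmix. The posterior values below are simulated for illustration only and do not come from a fitted model; the names post, h and root_heights are hypothetical.

```r
# Simulated stand-in for the posteriors of one component (an assumption,
# not output of a fitted model): many small values, a few near one
set.seed(42)
post <- c(rbeta(300, 0.3, 3), rbeta(40, 5, 0.5))

# Bin the values on the unit interval without plotting
h <- hist(post, breaks = seq(0, 1, length.out = 21), plot = FALSE)

# Rootogram bar heights are the square roots of the histogram counts
root_heights <- sqrt(h$counts)

# Draw both displays side by side
op <- par(mfrow = c(1, 2))
barplot(h$counts, names.arg = h$mids, main = "Histogram", ylab = "count")
barplot(root_heights, names.arg = h$mids, main = "Rootogram",
        ylab = "sqrt(count)")
par(op)
```

In the right-hand panel the dominant first bin is compressed, so the sparsely populated bins are easier to see.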

Usually in each component a lot of observations have posteriors close to zero, resulting in a high count for the corresponding bin in the rootogram, which obscures the information in the other bins. To avoid this problem, all posteriors below eps are ignored.
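
The effect of the eps threshold can be illustrated with a small base R sketch. The data are simulated, and the exact comparison used here (keeping values greater than or equal to eps) is an assumption made for the illustration, not taken from the flexmix source.

```r
# Simulated posteriors for one component (hypothetical data): most
# observations have a posterior of essentially zero
set.seed(1)
post <- c(rep(0, 500), runif(500, 0, 1e-6), rbeta(100, 4, 1))
eps <- 1e-4

# Drop posteriors below eps (boundary handling is an assumption)
kept <- post[post >= eps]

# Compare the bin counts before and after filtering
breaks <- seq(0, 1, length.out = 21)
counts_all  <- hist(post, breaks = breaks, plot = FALSE)$counts
counts_kept <- hist(kept, breaks = breaks, plot = FALSE)$counts

counts_all[1]   # first bin dominated by near-zero posteriors
counts_kept[1]  # first bin after removing them
```

Without the filter the first bin dwarfs all others; after filtering, the remaining bins determine the scale of the plot.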

A peak at probability one indicates that a mixture component is well separated from the other components, while no peak at one and/or significant mass in the middle of the unit interval indicates overlap with other components.
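
This interpretation can be checked on simulated data. The sketch below assumes a two-component Gaussian mixture with known parameters and computes the posteriors of the first component directly from Bayes' rule instead of fitting a model; the helper functions posterior_comp1 and frac_near_one and the cutoff 0.95 are hypothetical choices for the illustration.

```r
# Posterior probability of component 1 in a two-component Gaussian
# mixture with known means, common standard deviation and prior weight
posterior_comp1 <- function(x, mu1, mu2, sd = 1, prior1 = 0.5) {
  d1 <- prior1 * dnorm(x, mu1, sd)
  d2 <- (1 - prior1) * dnorm(x, mu2, sd)
  d1 / (d1 + d2)
}

set.seed(7)
n <- 500
z <- rbinom(n, 1, 0.5)  # 1 = observation drawn from component 1

# Well separated components (means 0 and 8) versus overlapping ones (0 and 1)
x_sep <- rnorm(n, mean = ifelse(z == 1, 0, 8))
x_ovl <- rnorm(n, mean = ifelse(z == 1, 0, 1))
p_sep <- posterior_comp1(x_sep, 0, 8)
p_ovl <- posterior_comp1(x_ovl, 0, 1)

# Share of the non-negligible posteriors that lie close to one
eps <- 1e-4
frac_near_one <- function(p) mean(p[p >= eps] > 0.95)

frac_near_one(p_sep)  # large: peak at one
frac_near_one(p_ovl)  # small: mass spread over the unit interval

# Visual comparison of the two cases
op <- par(mfrow = c(1, 2))
hist(p_sep[p_sep >= eps], breaks = 20, main = "Separated", xlab = "posterior")
hist(p_ovl[p_ovl >= eps], breaks = 20, main = "Overlapping", xlab = "posterior")
par(op)
```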

References

Friedrich Leisch. FlexMix: A general framework for finite mixture models and latent class regression in R. Journal of Statistical Software, 11(8), 2004. http://www.jstatsoft.org/v11/i08/

Jeremy Tantrum, Alejandro Murua and Werner Stuetzle. Assessment and pruning of hierarchical model based clustering. Proceedings of the 9th ACM SIGKDD international conference on Knowledge Discovery and Data Mining, pages 197-205. ACM Press, New York, NY, USA, 2003.

Friedrich Leisch. Exploring the structure of mixture model components. In Jaromir Antoch, editor, Compstat 2004 - Proceedings in Computational Statistics, pages 1405-1412. Physika Verlag, Heidelberg, Germany, 2004. ISBN 3-7908-1554-3.