broom (version 0.4.1)

mclust_tidiers: Tidying methods for Mclust objects

Description

These methods summarize the results of Mclust clustering into three tidy forms: tidy describes the size, mixing proportion, mean and variability of each component; augment adds the class assignments and their uncertainties to the original data; and glance summarizes the model-level statistics of the clustering.

Usage

## S3 method for class 'Mclust'
tidy(x, ...)

## S3 method for class 'Mclust'
augment(x, data, ...)

## S3 method for class 'Mclust'
glance(x, ...)

Arguments

x
Mclust object
...
extra arguments, not used
data
Original data (required for augment)

Value

All tidying methods return a data.frame without rownames, whose structure depends on the method chosen.
tidy returns one row per component, with the columns:
component
A factor identifying the cluster, from 1:k (or 0:k in the presence of a noise term in x)
size
The size of each component
proportion
The mixing proportion of each component
variance
For one-dimensional and spherical models, the variance of each component; omitted otherwise. NA for the noise component
mean
The mean of each component. For models with two or more dimensions, one mean column is added per dimension. NA for the noise component
augment returns the original data with two extra columns:
.class
The class assigned by the Mclust algorithm
.uncertainty
The uncertainty associated with the classification
glance returns a one-row data.frame with the columns
model
A character string denoting the model at which the optimal BIC occurs
n
The number of observations in the data
G
The optimal number of mixture components
BIC
The optimal BIC value
logLik
The log-likelihood corresponding to the optimal BIC
df
The number of estimated parameters
hypvol
The hypervolume parameter for the noise component if required, otherwise set to NA
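
The relationship between these columns and the fitted object can be checked directly. The following is a minimal sketch, assuming the built-in faithful dataset as illustrative input and the column names listed above for this version of broom; the name fit is arbitrary, and the fields G, bic and parameters$pro are read from the Mclust object itself.

```r
library(broom)
library(mclust)

# Fit a Gaussian mixture model; faithful is an arbitrary built-in dataset
# chosen only for illustration
fit <- Mclust(faithful)

components <- tidy(fit)     # one row per mixture component
model_row  <- glance(fit)   # one-row model summary

# Component sizes add up to the number of observations
sum(components$size) == nrow(faithful)

# Mixing proportions sum to 1 (no noise term in this fit)
isTRUE(all.equal(sum(components$proportion), 1))

# glance() mirrors values stored on the Mclust object
model_row$G == fit$G
isTRUE(all.equal(as.numeric(model_row$BIC), as.numeric(fit$bic)))
```

Each comparison is expected to evaluate to TRUE when the column names match those documented here; other broom versions may name some glance columns differently.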

See Also

Mclust

Examples


library(broom)
library(dplyr)
library(ggplot2)
library(mclust)

set.seed(2016)
centers <- data.frame(cluster=factor(1:3), size=c(100, 150, 50),
                      x1=c(5, 0, -3), x2=c(-1, 1, -2))
points <- centers %>% group_by(cluster) %>%
 do(data.frame(x1=rnorm(.$size[1], .$x1[1]),
               x2=rnorm(.$size[1], .$x2[1]))) %>%
 ungroup()

m <- Mclust(points %>% dplyr::select(x1, x2))

tidy(m)
head(augment(m, points))
glance(m)
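
The augmented output lends itself to further summaries and plots. The sketch below is one possible follow-up rather than part of the package examples: it assumes the built-in faithful dataset, and the names fit, assigned, per_class and p are illustrative. It counts observations per assigned class, averages their uncertainty, and draws a scatter plot in which larger points are less certain.

```r
library(broom)
library(dplyr)
library(ggplot2)
library(mclust)

# Fit on a built-in dataset (an arbitrary choice for illustration)
fit <- Mclust(faithful)

# Original data plus the .class and .uncertainty columns
assigned <- augment(fit, faithful)

# Number of observations and mean classification uncertainty per class
per_class <- assigned %>%
  group_by(.class) %>%
  summarise(n = n(), mean_uncertainty = mean(.uncertainty))
per_class

# Color by assigned class; point size grows with uncertainty
p <- ggplot(assigned, aes(eruptions, waiting,
                          color = factor(.class), size = .uncertainty)) +
  geom_point(alpha = 0.6) +
  labs(color = "class", size = "uncertainty")
print(p)
```

Points near the boundary between clusters would be expected to show the highest uncertainty.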
