broom (version 0.4.4)

mclust_tidiers: Tidying methods for Mclust objects

Description

These methods summarize the results of Mclust clustering in three tidy forms. tidy describes the size, mixing proportion, mean, and variability of each class; augment adds the class assignments and the uncertainty of each assignment to the original data; and glance summarizes the model parameters of the clustering.

Usage

# S3 method for Mclust
tidy(x, ...)

# S3 method for Mclust
augment(x, data, ...)

# S3 method for Mclust
glance(x, ...)

Arguments

x

An Mclust object

...

Extra arguments (not used)

data

Original data (required for augment)

Value

All tidying methods return a data.frame without rownames, whose structure depends on the method chosen.

tidy returns one row per component, with

component

A factor identifying the cluster, from 1:k (or 0:k when x includes a noise term)

size

The size of each component

proportion

The mixing proportion of each component

variance

For one-dimensional and spherical models, the variance of each component; omitted otherwise. NA for the noise component

mean

The mean of each component. For models with two or more dimensions, one mean column is added per dimension. NA for the noise component

augment returns the original data with two extra columns:

.class

The class assigned by the Mclust algorithm

.uncertainty

The uncertainty associated with the classification

glance returns a one-row data.frame with the columns

model

A character string denoting the model at which the optimal BIC occurs

n

The number of observations in the data

G

The optimal number of mixture components

BIC

The optimal BIC value

logLik

The log-likelihood corresponding to the optimal BIC

df

The number of estimated parameters

hypvol

The hypervolume parameter for the noise component, if the model includes one; otherwise NA

See Also

Mclust

Examples

# NOT RUN {
library(broom)
library(dplyr)
library(ggplot2)
library(mclust)

set.seed(2016)
centers <- data.frame(cluster=factor(1:3), size=c(100, 150, 50),
                      x1=c(5, 0, -3), x2=c(-1, 1, -2))
points <- centers %>% group_by(cluster) %>%
 do(data.frame(x1=rnorm(.$size[1], .$x1[1]),
               x2=rnorm(.$size[1], .$x2[1]))) %>%
 ungroup()

m <- Mclust(points %>% dplyr::select(x1, x2))

tidy(m)
head(augment(m, points))
glance(m)

# }
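
The .class and .uncertainty columns described under Value can be mapped straight onto plot aesthetics. The following is a minimal sketch rather than part of the package's own examples: the toy data frame and the names toy, fit, assigned and p are hypothetical, and it assumes broom, mclust and ggplot2 are installed.

```r
library(broom)
library(mclust)
library(ggplot2)

set.seed(2016)

# Hypothetical toy data: two well-separated Gaussian clusters in two dimensions
toy <- data.frame(
  x1 = c(rnorm(100, mean = 5), rnorm(100, mean = -3)),
  x2 = c(rnorm(100, mean = -1), rnorm(100, mean = 2))
)

fit <- Mclust(toy)

# augment() returns the original data plus the .class and .uncertainty columns
assigned <- augment(fit, toy)

# Colour each point by its assigned class and scale it by its uncertainty,
# so ambiguous points near cluster boundaries stand out
p <- ggplot(assigned, aes(x1, x2, colour = factor(.class), size = .uncertainty)) +
  geom_point(alpha = 0.6)
print(p)
```

Points lying between clusters would be expected to show the largest uncertainty, since that column reflects how ambiguous the assignment is.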
