greenclust (version 1.1.1)

greencut: Cut a Greenclust Tree into Optimal Groups

Description

Cuts a greenclust tree at an automatically-determined number of groups.

Usage

greencut(g, k = NULL, h = NULL)

Value

greencut returns a vector of group memberships, with the resulting r-squared value and p-value attached as object attributes, accessible via attr().

Arguments

g

a tree as produced by greenclust

k

an integer scalar with the desired number of groups

h

a numeric scalar with the desired height at which the tree should be cut

Details

The cut point is calculated by finding the number of groups/clusters that results in a collapsed contingency table with the most-significant (lowest p-value) chi-squared test. If there are ties, the smallest number of groups wins.
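The selection rule described above can be sketched in base R. This is an illustration only, not the package's implementation: `best_k` is a hypothetical helper, the tree is a plain hclust stand-in rather than a greenclust tree, and the range of candidate group counts (2 up to one fewer than the number of rows) is an assumption.

```r
# Hypothetical helper: choose the number of groups whose collapsed
# contingency table gives the lowest chi-squared p-value.
best_k <- function(tab, tree, k_range = 2:(nrow(tab) - 1)) {
  p_values <- vapply(k_range, function(k) {
    groups <- cutree(tree, k = k)
    collapsed <- rowsum(tab, group = groups)  # sum the rows within each group
    suppressWarnings(chisq.test(collapsed)$p.value)
  }, numeric(1))
  # which.min() returns the first minimum, so with an ascending k_range
  # ties are resolved in favor of the smallest number of groups
  k_range[which.min(p_values)]
}

# Small made-up table and a stand-in tree built from row proportions
tab <- matrix(c(30, 10, 28, 12, 5, 35, 6, 30, 15, 15),
              ncol = 2, byrow = TRUE,
              dimnames = list(letters[1:5], c("no", "yes")))
tree <- hclust(dist(prop.table(tab, 1)))

k <- best_k(tab, tree)
k
```

Relying on which.min() keeps the tie-breaking rule implicit in the ordering of the candidates, which is why k_range must be sorted in ascending order.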

If a certain number of groups is required or a specific r-squared (1 - height) threshold is targeted, values for either k or h may be provided. (While the regular cutree function could also be used in this circumstance, it may still be useful to have the additional attributes that greencut() provides.)

As with cutree(), k overrides h if both are given.
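The precedence rule can be checked directly with base cutree(). The sketch below uses a plain hclust tree as a stand-in (an assumption made so that it runs without greenclust installed); greencut() is documented above as following the same rule.

```r
# Stand-in tree built with base R; a greenclust tree is not required here
tree <- hclust(dist(c(1, 2, 6, 7, 15)))

by_k    <- cutree(tree, k = 2)           # cut by number of groups only
by_both <- cutree(tree, k = 2, h = 0.5)  # h is supplied but k takes precedence
by_h    <- cutree(tree, h = 0.5)         # h alone cuts below every merge

identical(by_k, by_both)   # same memberships: h was ignored
length(unique(by_h))       # h on its own yields a different grouping
```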

References

Greenacre, M.J. (1988) "Clustering the Rows and Columns of a Contingency Table," Journal of Classification 5, 39-51. doi:10.1007/BF01901670

See Also

greenclust, greenplot, assign.cluster

Examples

# Combine Titanic passenger attributes into a single category
# and create a contingency table for the non-zero levels
tab <- t(as.data.frame(apply(Titanic, 4:1, FUN=sum)))
tab <- tab[apply(tab, 1, sum) > 0, ]

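library(greenclust)

# Cluster the rows, then cut the tree at the optimal number of groups
grc <- greenclust(tab)
groups <- greencut(grc)
groups

# Plot the dendrogram and outline each group in its own color
plot(grc)
rect.hclust(grc, max(groups),
            border=unique(groups)+1)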