freqweights (version 0.0.1)

hclustvfreq: Fast hierarchical, agglomerative clustering of frequency data

Description

This function implements a version of the hierarchical, agglomerative clustering function hclust.vector (from the fastcluster package) adapted to tables of frequencies.

Usage

hclustvfreq(data, freq = ~freq, method = "single", metric = "euclidean",
  p = NULL)

Arguments

data
any object that can be coerced into a double matrix
freq
a one-sided, single-term formula specifying the frequency weights
method
the agglomeration method to be used. This must be (an unambiguous abbreviation of) one of "single", "ward", "centroid" or "median".
metric
the distance measure to be used. This must be one of "euclidean", "maximum", "manhattan", "canberra", "binary" or "minkowski".
p
parameter for the Minkowski metric.
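
As an illustration of the freq argument, the following base R sketch shows how a one-sided, single-term formula can name the weight column, and how that column might be separated from the variables to be clustered. The data frame tt and the names weight_col, weights and features are hypothetical; this is not necessarily how hclustvfreq handles the formula internally.

```r
# A one-sided formula has no left-hand side; a single term names one variable.
f <- ~freq
length(f)                    # 2: the `~` operator and its right-hand side
weight_col <- all.vars(f)    # "freq"

# Hypothetical table of frequencies: two variables plus a weight column.
tt <- data.frame(x = c(1, 2, 4), y = c(0, 1, 1), freq = c(3, 1, 2))

# Pull out the weights, then drop the weight column from the variables.
weights  <- tt[[weight_col]]
features <- as.matrix(tt[, setdiff(names(tt), weight_col), drop = FALSE])

storage.mode(features)       # "double"
colnames(features)           # "x" "y"
```

Here the weight column is removed before the remaining columns are coerced to a double matrix, mirroring the description of the data and freq arguments.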

Details

Any variables named in the freq formula are removed from the data set before clustering.

This function is an adaptation of hclust.vector to be used with tables of frequencies.
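
To make the idea of a table of frequencies concrete, here is a minimal base R sketch that collapses duplicated rows into unique rows with a count, then expands them again. The data frame raw and the use of aggregate are assumptions made for illustration; the package provides its own helper, tablefreq, which is used in the Examples.

```r
# Hypothetical raw data with repeated observations.
raw <- data.frame(x = c(1, 1, 2, 2, 2, 5),
                  y = c(0, 0, 1, 1, 1, 3))

# Collapse to one row per distinct (x, y) pair, counting occurrences.
raw$freq <- 1
tt <- aggregate(freq ~ x + y, data = raw, FUN = sum)

# Expanding each row by its frequency recovers the original number of rows.
expanded <- tt[rep(seq_len(nrow(tt)), tt$freq), c("x", "y")]

nrow(tt)                      # 3 distinct rows
sum(tt$freq) == nrow(raw)     # TRUE
nrow(expanded) == nrow(raw)   # TRUE
```

Clustering the three weighted rows instead of the six raw rows is the kind of saving a frequency-aware function aims for when the data contain many duplicates.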

See Also

hclust.vector, smartround

Examples

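library(dplyr)
library(fastcluster)
library(freqweights)  # provides hclustvfreq() and tablefreq()

# Cluster the raw observations with fastcluster
data <- iris[, 1:3, drop = FALSE]
aa <- hclust.vector(data)

# Same data through hclustvfreq, using the constant formula ~1
af <- hclustvfreq(data, freq = ~1)
all.equal(af, aa) # Equal except in some fields

data$group <- cutree(aa, 3)

# Collapse the data to a table of frequencies and cluster it
tt <- tablefreq(data)
bb <- hclustvfreq(tt)
tt$group <- cutree(bb, 3)

# Compare the unique rows of tt (dropping its last column) with the raw data
all.equal(unique(tt[, -ncol(tt)]), unique(data))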