freqweights (version 0.1.0)

hclustvfreq: Fast hierarchical, agglomerative clustering of frequency data

Description

This function implements a version of the hierarchical, agglomerative clustering function hclust.vector that works on tables of frequencies.

Usage

hclustvfreq(data, freq = NULL, method = "single", metric = "euclidean",
  p = NULL)

.hclustvfreq(tfq, method = "single", metric = "euclidean", p = NULL)

Arguments

data
any object that can be coerced into a double matrix
method
the agglomeration method to be used. This must be (an unambiguous abbreviation of) one of "single", "ward", "centroid" or "median".
freq
a one-sided, single-term formula specifying the frequency weights
metric
the distance measure to be used. This must be one of "euclidean", "maximum", "manhattan", "canberra", "binary" or "minkowski".
p
parameter for the Minkowski metric.
tfq
a frequency table

Details

Any variables in the formula are removed from the data set.

This function is a wrapper around hclust.vector for use with tables of frequencies. It passes the frequency weights to hclust.vector as the members argument.
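
As a rough illustration of the idea described above, the following sketch builds a frequency table by hand and passes the counts to fastcluster's hclust.vector through its members argument. It is not the package's implementation: the table is built with base R's aggregate instead of tablefreq, the column name freq is chosen here only for illustration, and "ward" is used simply because it is one of the methods where the weights affect the result.

```r
library(fastcluster)

# Collapse duplicated rows into a frequency table (illustrative stand-in
# for tablefreq; the column name "freq" is an assumption of this sketch).
x <- iris[, 1:3]
tfq <- aggregate(freq ~ ., data = cbind(x, freq = 1), FUN = sum)

# Cluster the distinct rows, using the counts as cluster sizes.
X <- as.matrix(tfq[, names(x)])
hc <- hclust.vector(X, method = "ward", metric = "euclidean",
                    members = tfq$freq)

# One label per distinct row, not per original observation.
groups <- cutree(hc, k = 3)
```

The weights add up to the number of original observations, while the clustering itself only sees the distinct rows.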

See Also

hclust.vector, tablefreq

Examples

library(dplyr)
library(fastcluster)
library(freqweights)

# Cluster the raw data directly
data <- iris[, 1:3, drop = FALSE]
hc <- hclustvfreq(data, method = "centroid", metric = "euclidean")
cutree(hc, 3) ## The result has a different length than data

# Cluster a precomputed frequency table and store the group labels in it
tfq <- tablefreq(iris[, 1:3])
hc <- .hclustvfreq(tfq, method = "centroid", metric = "euclidean")
tfq$group <- cutree(hc, 3)
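
Because the labels are attached to the frequency table and not to the original rows, it can be useful to carry them back to the full data set. The sketch below shows one possible way to do that with base R only. It rebuilds the frequency table with aggregate instead of tablefreq so that it is self-contained; the helper key and the column name freq are hypothetical names introduced here for illustration.

```r
library(fastcluster)

# Frequency table built with base R (stand-in for tablefreq).
x <- iris[, 1:3]
tfq <- aggregate(freq ~ ., data = cbind(x, freq = 1), FUN = sum)

hc <- hclust.vector(as.matrix(tfq[, names(x)]), method = "ward",
                    members = tfq$freq)
tfq$group <- cutree(hc, k = 3)

# Hypothetical helper: one string key per row, built from all columns.
key <- function(d) do.call(paste, c(d, sep = "\r"))

# Look up each original row in the frequency table and copy its label.
x$group <- tfq$group[match(key(x), key(tfq[, names(iris)[1:3]]))]
```

After this step every original observation has a group label, and identical rows always share the same label.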
