hclustvfreq: Fast hierarchical, agglomerative clustering of frequency data

Description

This function implements a version of the hierarchical, agglomerative clustering function hclust.vector that works on tables of frequencies.

Usage

hclustvfreq(data, freq = NULL, method = "single", metric = "euclidean",
  p = NULL)
.hclustvfreq(tfq, method = "single", metric = "euclidean", p = NULL)

Arguments

data

any object that can be coerced into a double matrix

method

the agglomeration method to be used. This must be (an unambiguous abbreviation of) one of "single", "ward", "centroid" or "median".

freq

a one-sided, single term formula specifying frequency weights

metric

the distance measure to be used. This must be one of "euclidean", "maximum", "manhattan", "canberra", "binary" or "minkowski"

p

parameter for the Minkowski metric.

tfq

a frequency table

Details

Any variables in the formula are removed from the data set.

This function is a wrapper around hclust.vector for use with tables of frequencies. It passes the frequency weights as the members argument.
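
To illustrate the idea of clustering a frequency table with the counts supplied as members, here is a minimal sketch that calls fastcluster's hclust.vector directly. It is not the package's own implementation: the frequency table is built with base R's aggregate, and the column name freq is an assumption made for this example only.

```r
library(fastcluster)

# Collapse duplicated rows into a frequency table using base R.
# The column name `freq` is chosen for this sketch only.
data <- iris[, 1:3]
data$freq <- 1
tfq <- aggregate(freq ~ ., data = data, FUN = sum)

# Cluster the distinct rows, passing the counts as `members` so that
# each row is treated as a cluster of that many observations.
X <- as.matrix(tfq[, names(tfq) != "freq"])
hc <- hclust.vector(X, method = "centroid", metric = "euclidean",
                    members = tfq$freq)

# cutree() returns one label per distinct row of the frequency table.
tfq$group <- cutree(hc, k = 3)
```

Because only the distinct rows are clustered, the labels line up with the rows of the frequency table rather than with the rows of the original data; joining the table back to the data on the shared columns would recover a label for every observation.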

Examples

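library(dplyr)
library(fastcluster)
library(freqweights)  # provides hclustvfreq() and tablefreq()

# Cluster raw data; the function builds the frequency table internally.
data <- iris[, 1:3, drop = FALSE]
hc <- hclustvfreq(data, method = "centroid", metric = "euclidean")
cutree(hc, 3)  # one label per distinct row, so the length differs from nrow(data)

# Cluster a precomputed frequency table and attach the group labels to it.
tfq <- tablefreq(iris[, 1:3])
hc <- .hclustvfreq(tfq, method = "centroid", metric = "euclidean")
tfq$group <- cutree(hc, 3)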
