clusGapDiscr: Discrete application of clusGap

Description

Based on the implementation of the function found in the `cluster` R package.

Usage

clusGapDiscr(
  x,
  clusterFUN,
  K.max,
  B = nrow(x),
  value.range = "DS",
  verbose = interactive(),
  distName = "hamming",
  useLog = TRUE,
  ...
)

Value

a matrix with K.max rows and 4 columns, named "logW", "E.logW", "gap", and "SE.sim", where gap = E.logW - logW, and SE.sim correspond to the standard error of `gap`.

Arguments

x: A matrix object specifying category attributes in the columns and observations in the rows.
clusterFUN: Character string with one of the available clustering implementations. Available options are: 'pam' (default) from `cluster::pam`, 'diana' from `cluster::diana`, 'fanny' from `cluster::fanny`, 'agnes-{average, single, complete, ward, weighted}' from `cluster::fanny`, 'hclust-{ward.D, ward.D2, single, complete, average, mcquitty, median, centroid}' from `stats::hclust`, 'kmodes' from `klar::kmodes` (`iter.max = 10`, `weighted = FALSE` and `fast= TRUE`). 'kmodes-N' enables to run the `kmodes` algorithm with a given number N of iterations where `iter.max = N`.
K.max: Integer. Maximum number of clusters `k` to consider
B: Number of bootstrap samples. By default B = nrow(x).
value.range: String character vector or a list of character vector with the length matching the number of columns (nQ) of the array. A vector with all categories to consider when bootstrapping the null distribution sample (KS: Known Support option). By DEFAULT vals=NULL, meaning unique range of categories found in the data will be used when drawing the null (DS: Data Support option). If a character vector of categories is provided, these values would be used for the null distribution drawing across the array. If a list with category character vectors is provided, it has to have the same number of columns as the input array. The order of list element corresponds to the array's columns.
verbose: Integer or logical. Determines whether progress output should printed while running. By DEFAULT one bit is printed per bootstrap sample.
distName: String. Name of categorical distance to apply. Available distances: 'bhattacharyya', 'chisquare', 'cramerV', 'hamming' and 'hellinger'.
useLog: Logical. Use log function after estimating `W.k`. Following the original formulation `useLog=TRUE` by default.
...: optionally further arguments for `FUNcluster()`