DiscreteGapStatistic (version 1.1.2)

ResHeatmap: Discrete Data Heatmap

Description

Heatmap assuming a given distance function and a known number of clusters. Displays a categorical data matrix given a user-defined number of clusters `nCl`, a categorical distance `distName` and a predefined clustering method `clusterFUN`. The heatmap separates and color-labels the resulting clusters vertically in the rows and allows unsupervised clustering of the questions in the columns. Each cell is colored according to the categorical values provided or found in the data. The clustergram is based on the `pheatmap` function from the pheatmap R package. Thus, any parameter found in pheatmap can be specified to `clusGapDiscrHeat`. This function can be used to examine the number of clusters before running `clusGapDiscrHeat`, and also after the number of clusters is determined.

Usage

ResHeatmap(
  x,
  nCl,
  distName,
  catVals,
  clusterFUN,
  out = "heatmap",
  seed = NULL,
  clusterNames = NULL,
  prefObs = NULL,
  rowNames = rownames(x),
  filename = NULL,
  outDir = NULL,
  height = 10,
  width = 6
)

Value

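A png file or ComplexHeatmap object when `out = "heatmap"`; a `data.frame` with clustering assignments when `out` is "clusters" or "clustersReord".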

Arguments

x

matrix object or data.frame

nCl

Number of clusters to plot. If `nCl` is a permutation vector of the first N integers, the clusters are rearranged according to the ordering given.

distName

Name of categorical distance to apply. Available distances: 'bhattacharyya', 'chisquare', 'cramerV', 'hamming' and 'hellinger'.
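
To make the idea of a categorical distance concrete, the following base-R sketch computes a Hamming distance between the rows of a small categorical matrix, taken here as the number of columns in which two rows disagree. The helper `hamming_rows` and the toy matrix are hypothetical and only illustrate the concept; the package's own 'hamming' option may normalize the count or treat missing values differently.

```r
# Toy categorical matrix: 4 observations (rows) answering 3 questions (columns).
x <- matrix(
  c("a", "b", "c",
    "a", "b", "b",
    "c", "c", "c",
    "a", "b", "c"),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("obs", 1:4), paste0("q", 1:3))
)

# Hypothetical helper (not part of the package): pairwise Hamming distance
# between the rows of x, i.e. the number of columns where two rows disagree.
hamming_rows <- function(x) {
  n <- nrow(x)
  d <- matrix(0, n, n, dimnames = list(rownames(x), rownames(x)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      d[i, j] <- sum(x[i, ] != x[j, ])
    }
  }
  d
}

d <- hamming_rows(x)
print(d)
```

Identical rows (here `obs1` and `obs4`) end up at distance 0, so a clustering method applied to `d` would be expected to group them together.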

catVals

character string vector with (ordered) categorical values

clusterFUN

Character string with one of the available clustering implementations. Available options are: 'pam' (default) from `cluster::pam`, 'diana' from `cluster::diana`, 'fanny' from `cluster::fanny`, 'agnes-{average, single, complete, ward, weighted}' from `cluster::agnes`, 'hclust-{ward.D, ward.D2, single, complete, average, mcquitty, median, centroid}' from `stats::hclust`, 'kmodes' from `klaR::kmodes` (`weighted = FALSE` and `fast = TRUE`).
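
Option names such as 'hclust-average' pair a clustering function with a linkage method. The sketch below shows one way such a string could be split and handed to `stats::hclust`, followed by `stats::cutree` to obtain a fixed number of clusters. The parsing step and the toy dissimilarities are assumptions made for illustration; how the package dispatches these options internally is not documented here.

```r
# Option string in the 'function-method' form listed above.
clusterFUN <- "hclust-average"

# Split into the clustering function name and the linkage method
# (illustrative parsing, not the package's own code).
parts  <- strsplit(clusterFUN, "-", fixed = TRUE)[[1]]
fun    <- parts[1]   # "hclust"
method <- parts[2]   # "average"

# Toy dissimilarity matrix for 4 observations (values are made up).
obs <- paste0("obs", 1:4)
d <- as.dist(matrix(
  c(0, 1, 3, 3,
    1, 0, 3, 3,
    3, 3, 0, 1,
    3, 3, 1, 0),
  nrow = 4, dimnames = list(obs, obs)
))

# Hierarchical clustering with the requested linkage, cut into nCl groups.
nCl  <- 2
tree <- stats::hclust(d, method = method)
cl   <- stats::cutree(tree, k = nCl)
print(cl)
```

With these dissimilarities, `obs1` and `obs2` fall in one cluster and `obs3` and `obs4` in the other.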

out

Desired output, one of "heatmap" (default; produce a heatmap), "clusters" (return a `data.frame` with clustering assignments) or "clustersReord" (return a `data.frame` with reorganized clusters).

seed

Seed number.

clusterNames

Either `NULL` or 'renumber'. When `nCl` is a numerical vector, the cluster ordering is rearranged. `NULL` leaves cluster names as their original cluster assignment. 'renumber' respects the rearrangement but relabels the cluster numbers from top to bottom in ascending order.
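
The difference between the two settings can be pictured with a small base-R sketch. It assumes that a permutation passed as `nCl` lists the clusters in the order they should appear from top to bottom; the cluster assignments and variable names are made up, and the package's internal implementation may differ.

```r
# Made-up cluster assignments for 6 observations in 3 clusters.
cl <- c(obs1 = 1, obs2 = 3, obs3 = 2, obs4 = 1, obs5 = 3, obs6 = 2)

# nCl given as a permutation: show cluster 3 first, then 1, then 2.
nCl <- c(3, 1, 2)

# Row order that stacks the clusters top to bottom in the requested order.
row_order <- order(match(cl, nCl))
cl_sorted <- cl[row_order]

# clusterNames = NULL: keep the original cluster labels.
labels_original <- cl_sorted

# clusterNames = 'renumber': relabel clusters by their position in nCl,
# so labels ascend from top to bottom.
labels_renumbered <- match(cl_sorted, nCl)

print(rbind(original = labels_original, renumbered = labels_renumbered))
```

In this sketch the rows are stacked in the same order under both settings; only the labels attached to each block of rows change.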

prefObs

character string vector of length 1 with a prefix for the observations, in case they come unlabelled or the user wants to anonymize sample IDs.

rowNames

character vector with names of rows according to `x`. By default, `rownames(x)` will be printed in the plot. `rowNames = NULL` prevents names from being shown. The `prefObs` option takes precedence if it is not `NULL`.

filename

character string with name of file output

outDir

character string with the directory path to save output file

height

numeric height of output plot in inches

width

numeric width of output plot in inches
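
As a rough illustration of how the arguments above fit together, the sketch below builds a small simulated categorical matrix and requests the clustering assignments rather than a plot. The data, category labels and parameter values are made up; the argument names are taken from the Usage section, and the call is guarded so it only runs if the package is installed. The exact structure of the returned `data.frame` is not described on this page, so it is only inspected with `str()`.

```r
# Simulated categorical data: 20 observations answering 5 questions (made up).
set.seed(1)
catVals <- c("low", "mid", "high")
x <- matrix(
  sample(catVals, 20 * 5, replace = TRUE),
  nrow = 20,
  dimnames = list(paste0("obs", 1:20), paste0("q", 1:5))
)

# Only attempt the call when the package is available.
if (requireNamespace("DiscreteGapStatistic", quietly = TRUE)) {
  # out = "clusters" asks for the clustering assignments instead of a heatmap.
  res <- DiscreteGapStatistic::ResHeatmap(
    x,
    nCl = 3,
    distName = "hamming",
    catVals = catVals,
    clusterFUN = "pam",
    out = "clusters",
    seed = 1
  )
  str(res)
}
```

Switching `out` to "heatmap" and supplying `filename` and `outDir` would, per the argument descriptions, write the plot to a file instead.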