This function aligns observations within the layout according to a hierarchical clustering tree, enabling reordering or grouping of elements based on clustering results.
align_dendro(
mapping = aes(),
...,
distance = "euclidean",
method = "complete",
use_missing = "pairwise.complete.obs",
reorder_dendrogram = FALSE,
merge_dendrogram = FALSE,
reorder_group = FALSE,
k = NULL,
h = NULL,
cutree = NULL,
plot_dendrogram = TRUE,
plot_cut_height = NULL,
root = NULL,
center = FALSE,
type = "rectangle",
size = NULL,
data = NULL,
no_axes = NULL,
active = NULL,
free_guides = deprecated(),
free_spaces = deprecated(),
plot_data = deprecated(),
theme = deprecated(),
free_labs = deprecated(),
set_context = deprecated(),
order = deprecated(),
name = deprecated()
)
An "AlignDendro" object.
Default list of aesthetic mappings to use for plot. If not specified, must be supplied in each layer added to the plot.
<dyn-dots> Additional arguments passed to
geom_segment().
A string naming the distance measure to use. This must be one of
"euclidean", "maximum", "manhattan", "canberra", "binary" or
"minkowski". A correlation coefficient can also be used:
"pearson", "spearman" or "kendall". In that case, 1 - cor is used
as the distance. You can also provide a dist
object directly, or a function that returns a dist object. Use
NULL if you don't want to calculate the distance.
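The correlation-based option and the function form can be sketched in base R. This is only an illustration under assumptions: it assumes rows are the observations being correlated and that a distance function receives the input matrix; `manhattan_dist` is a hypothetical helper, not part of the package.

```r
# Rows are treated as observations, so correlate rows by transposing first.
set.seed(1)
m <- matrix(rnorm(40), nrow = 8)

# Assumed equivalent of distance = "pearson": 1 - correlation between rows.
d_cor <- as.dist(1 - cor(t(m), method = "pearson"))

# Hypothetical function form: takes the matrix and returns a dist object.
manhattan_dist <- function(x) dist(x, method = "manhattan")
d_fun <- manhattan_dist(m)

# Both objects describe the same 8 observations.
attr(d_cor, "Size")
attr(d_fun, "Size")
```

A function like `manhattan_dist` would then be passed as `distance = manhattan_dist`.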
A string of the agglomeration method to be used. This should be
(an unambiguous abbreviation of) one of "ward.D", "ward.D2", "single",
"complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (=
WPGMC) or "centroid" (= UPGMC). You can also provide a function which
accepts the calculated distance (or the input matrix if distance is NULL)
and returns an hclust object. Alternatively, you can supply
an object which can be coerced to hclust.
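As a minimal base R sketch of the function form of method: the function receives the distance and returns an hclust object. `average_linkage` is a hypothetical name used only for illustration.

```r
set.seed(1)
m <- matrix(rnorm(40), nrow = 8)
d <- dist(m)

# String form: the linkage method is named directly.
hc_string <- hclust(d, method = "average")

# Function form (hypothetical helper): accepts the distance, returns an hclust.
average_linkage <- function(d) hclust(d, method = "average")
hc_fun <- average_linkage(d)

# Both routes build the same tree.
identical(hc_string$merge, hc_fun$merge)
```

Such a function would be supplied as `method = average_linkage`.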
An optional character string giving a method for computing
covariances in the presence of missing values. This must be (an abbreviation
of) one of the strings "everything", "all.obs", "complete.obs",
"na.or.complete", or "pairwise.complete.obs". Only used when distance
is a correlation coefficient string.
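The effect of the missing-value option can be seen with `stats::cor()` alone, whose `use` argument accepts the same strings. The sketch below assumes the option is forwarded to a correlation call of this kind.

```r
set.seed(1)
m <- matrix(rnorm(40), nrow = 8)
m[1, 2] <- NA  # introduce a single missing value in the first observation

# "everything" propagates the NA to every correlation involving row 1.
strict <- cor(t(m), use = "everything")

# "pairwise.complete.obs" drops the missing pair and keeps the rest.
pairwise <- cor(t(m), use = "pairwise.complete.obs")

anyNA(strict)    # TRUE
anyNA(pairwise)  # FALSE
```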
A single boolean value indicating whether to
reorder the dendrogram based on the means. Alternatively, you can provide a
custom function that accepts an hclust object and the data
used to generate the tree, returning either an hclust or
dendrogram object. Default is FALSE.
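A custom reordering function could look like the following sketch, which uses `stats::reorder()` on a dendrogram with row means as weights. The name `reorder_by_mean` and the choice of `agglo.FUN = mean` are illustrative assumptions, not the package's internal implementation.

```r
set.seed(1)
m <- matrix(rnorm(40), nrow = 8)
hc <- hclust(dist(m))

# Hypothetical custom function: accepts the hclust tree and the data,
# returns a dendrogram whose branches are reordered by the row means.
reorder_by_mean <- function(tree, data) {
  reorder(as.dendrogram(tree), wts = rowMeans(data), agglo.FUN = mean)
}

dend <- reorder_by_mean(hc, m)
order.dendrogram(dend)  # leaf order after reordering
```

It would be passed as `reorder_dendrogram = reorder_by_mean`.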
A single boolean value indicating whether to
merge multiple dendrograms. Only used when previous groups have been
established. Default: FALSE.
A single boolean value indicating whether to perform
hierarchical clustering between groups. Only used when previous groups have
been established. Default: FALSE.
An integer scalar indicating the desired number of groups.
A numeric scalar indicating the height at which the tree should be cut.
A function used to cut the hclust tree. It
should accept four arguments: the hclust tree object,
distance (only applicable when method is a string or a function for
performing hierarchical clustering), k (the number of clusters), and h (the
height at which to cut the tree). By default, cutree()
is used.
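A four-argument cutting function matching the description above might be sketched as follows. `cut_tree` is a hypothetical name; it simply delegates to `stats::cutree()` and ignores the distance.

```r
set.seed(1)
m <- matrix(rnorm(40), nrow = 8)
d <- dist(m)
hc <- hclust(d)

# Hypothetical cut function with the four documented arguments:
# the tree, the distance, k (number of groups) and h (cut height).
cut_tree <- function(tree, dist, k, h) {
  stats::cutree(tree, k = k, h = h)
}

groups <- cut_tree(hc, d, k = 3L, h = NULL)
table(groups)  # group sizes for the three clusters
```

A function that does use the distance (for example, to pick k from a quality score) would follow the same signature.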
A boolean value indicating whether to plot the dendrogram tree.
A boolean value indicating whether to plot the cut height.
A length-one string or numeric indicating the root branch.
A boolean value. If TRUE, nodes are plotted centered with
respect to the leaves in the branch. Otherwise (default), they are plotted in the
middle of all direct child nodes.
A string indicating the plot type: "rectangle" or "triangle".
The relative size of the plot, can be specified as a
unit.
A matrix-like object. By default, it inherits from the layout
matrix.
Logical; if TRUE, removes axis elements for the alignment axis using
theme_no_axes(). By default, this is controlled by the option
"ggalign.align_no_axes".
An active() object that defines the context settings when
added to a layout.
align_dendro initializes a ggplot with default data and mapping.
Internally, it always uses a default mapping of aes(x = .data$x, y = .data$y).
The default ggplot data is the node coordinates, with the edge data attached
in the ggalign attribute. In addition, a
geom_segment layer that uses the edge
coordinates as its data is added.
The node and edge (tree segment) coordinates contain the following columns:
index: the original index in the tree for the current node.
label: the node label text.
x and y: the x-axis and y-axis coordinates of the current node, or of the start
node of the current edge.
xend and yend: the x-axis and y-axis coordinates of the terminal node
of the current edge.
branch: the branch to which the current node or edge belongs. You can use this column
to color different groups.
panel: the panel in which the current node lies. If the plot is split into panels
using facet_grid, this column shows
which panel the current node or edge is from. Note: some nodes may
fall outside the panels (between two panels), so this column may contain
NA values.
.panel: similar to the panel column, but always gives the correct branch
for use with ggplot faceting.
panel1 and panel2: these have the same
function as panel, but they are specific to the edge data
and correspond to the two nodes of each edge.
leaf: a logical value indicating whether the current node is a leaf.
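The branch column can be mapped to an aesthetic to color the groups produced by cutting the tree. The sketch below reuses the calls from this page's own examples. The `requireNamespace()` guard is an addition so the snippet is skipped when ggalign is not installed, and it assumes that mapping `color = branch` is accepted by the default segment layer.

```r
if (requireNamespace("ggalign", quietly = TRUE)) {
  library(ggplot2)
  library(ggalign)
  set.seed(1)
  # Cut the column dendrogram into three groups and color edges by branch.
  p <- ggheatmap(matrix(rnorm(81), nrow = 9)) +
    anno_top() +
    align_dendro(aes(color = branch), k = 3L)
  print(p)
}
```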
It is important to note that we consider rows as observations, meaning
vec_size(data)/NROW(data) must match the number of observations along the
axis used for alignment (x-axis for a vertical stack layout, y-axis for a
horizontal stack layout).
quad_layout()/ggheatmap(): For column annotation, the layout
matrix will be transposed before use (if data is a function, it is
applied to the transposed matrix), as column annotation uses columns as
observations but alignment requires rows.
stack_layout(): The layout matrix is used as is, aligning all plots
along a single axis.
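The transposition rule can be illustrated with base R: clustering the columns of a matrix means treating them as rows of the transposed matrix. This is a sketch of the idea only, not of the layout internals.

```r
set.seed(1)
m <- matrix(rnorm(36), nrow = 9)  # 9 rows, 4 columns

# Row annotation: rows are the observations, so the matrix is used as is.
hc_rows <- hclust(dist(m))

# Column annotation: columns are the observations, so transpose first.
hc_cols <- hclust(dist(t(m)))

length(hc_rows$order)  # 9 observations
length(hc_cols$order)  # 4 observations
```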
dendrogram_data()
hclust2()
ggheatmap(matrix(rnorm(81), nrow = 9)) +
anno_top() +
align_dendro()
ggheatmap(matrix(rnorm(81), nrow = 9)) +
anno_top() +
align_dendro(k = 3L)