GSNA (version 0.1.4.2)

gsnHierarchicalDendrogram: gsnHierarchicalDendrogram

Description

Generate a dendrogram plot of a hierarchically clustered set of GSNA distances. This requires an embedded hierarchical cluster object of type 'hclust' associated with the default or specified distance metric. Such an object may be generated by running gsnPareNetGenericHierarchic() on a GSNData object prior to running this function.

The graphical output of this function can be a horizontal or circular dendrogram. When show.leaves is TRUE and stat_col (and optionally stat_col_2) is specified, the function outputs a dendrogram image with leaves colored by the significance indicated in stat_col and optionally stat_col_2 (with a 1- or 2-dimensional color scale). If n_col is specified, the leaf sizes are scaled by the column indicated therein.

The function has many optional arguments, but only a few should be necessary to get a decent plot.
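As a rough illustration of the workflow described above, the following sketch wraps the two calls in a helper. The helper name plot_gsn_dendrogram, its out_file argument, and the gsn_obj object are hypothetical; gsn_obj is assumed to be a GSNData object that already contains a distance matrix and imported pathways data, and other preparatory steps (such as assigning subnets) may be needed depending on the analysis.

```r
# Sketch only: `gsn_obj` is a hypothetical, already-populated GSNData object.
plot_gsn_dendrogram <- function(gsn_obj, out_file = NULL) {
  if (!requireNamespace("GSNA", quietly = TRUE)) {
    stop("The GSNA package is required for this sketch.")
  }
  # Generate the embedded 'hclust' object that the dendrogram function needs.
  gsn_obj <- GSNA::gsnPareNetGenericHierarchic(gsn_obj)
  # Draw a horizontal dendrogram with leaves shown; with out_file = NULL the
  # plot goes to the current graphical device.
  GSNA::gsnHierarchicalDendrogram(
    gsn_obj,
    show.leaves = TRUE,
    geometry = "horizontal",
    filename = out_file
  )
}
```

Calling plot_gsn_dendrogram(gsn_obj, out_file = "dendrogram.pdf") would then write the plot to a PDF file, assuming the object is set up as described.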

Usage

gsnHierarchicalDendrogram(
  object,
  distance = NULL,
  subnet_colors = NULL,
  filename = NULL,
  file = NULL,
  out_format = NULL,
  width = NULL,
  height = NULL,
  .mai.plot = NULL,
  cex = par("cex"),
  subnetColorsFunction = gsnDendroSubnetColors_dark,
  id_col = NULL,
  id_nchar = NULL,
  pathways_title_col = c("Title", "Name", "NAME", "STANDARD_NAME"),
  substitute_id_col = NULL,
  font_face = NULL,
  color_labels_by = "subnet",
  show.leaves = FALSE,
  show.legend = TRUE,
  pathways_dat = NULL,
  stat_col = NULL,
  stat_col_2 = NULL,
  sig_order = NULL,
  sig_order_2 = NULL,
  n_col = NULL,
  transform_function = nzLog10,
  leaf_colors = c("white", "yellow", "red"),
  leaf_colors.1 = c("#FFFFFF", "red"),
  leaf_colors.2 = c("#FFFFFF", "blue"),
  leaf_border_color = "#666666",
  legend.leaf.col = "#CCCCCC",
  combine_method = "scaled_geomean",
  use_leaf_border = TRUE,
  render.plot = TRUE,
  c1.fun = NULL,
  c2.fun = NULL,
  geometry = "horizontal",
  .plt.plot = NULL,
  leaves_pch = NULL,
  leaf_char_shift = 1,
  na.color = "#CCCCCC",
  leaf_cex = NULL,
  leaf_cex_range = c(0.5, 2.1),
  lab.cex = NULL,
  tree_x_size.in = 2,
  legend_x_size.in = 2,
  left_margin.in = 0,
  right_margin.in = NULL,
  top_margin.in = NULL,
  bottom_margin.in = 0,
  legend.downshift.in = NULL,
  bkt_lmargin_chars = 4,
  legend_spacing.x.in = 2 * par("cin")[1],
  legend_spacing.y.in = par("cin")[2],
  legend.lab.cex = NULL,
  legend.axis.cex = NULL,
  legend.free.cex.bool = FALSE,
  main = NULL,
  cex.main = NULL,
  mar.main = 3.2,
  lines.main = 1.5,
  colors.n = 100,
  legend.bg = par("bg"),
  legend.fg = par("fg"),
  resolution = 72,
  draw.legend.box.bool = TRUE,
  DO_BROWSER = FALSE
)

Value

An object of type 'dendrogram', with the attribute "GSNA_plot_params" containing a list of plot parameters. This list is useful for retrieving plot parameters set by the function, so that they might be optimized. Likewise, the dendrogram object itself can be replotted or analyzed by other means.
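The attribute can be read back with base R's attr(). The sketch below uses a stand-in dendrogram built from a toy data set, and the contents of the parameter list (width, height) are invented for illustration; the element names actually stored by the function may differ.

```r
# Stand-in for the return value: a base R dendrogram built from toy data.
hc <- hclust(dist(USArrests[1:10, ]))
dend <- as.dendrogram(hc)

# Hypothetical parameter list, attached here only to mimic the return value.
attr(dend, "GSNA_plot_params") <- list(width = 7, height = 5)

# Retrieve the plot parameters for inspection or reuse.
params <- attr(dend, "GSNA_plot_params")
str(params)

# The dendrogram itself remains an ordinary 'dendrogram' object.
n_leaves <- attr(dend, "members")
```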

Arguments

object

An object of the class GSNData

distance

(optional) A character vector of length one to indicate the desired distance metric to be used for generating a hierarchical dendrogram, e.g. 'lf', 'jaccard', 'stlf', etc. Defaults to the value of the object's default_distance.

subnet_colors

(optional) A character vector of color codes matching the desired colors for subnets. If NULL, the colors are set automatically.

filename

(optional) A file for outputting a graphical image to a file as opposed to the current graphical device. Output format is automatically detected from the file suffix, but can be overridden using the out_format argument. (See details.)

file

(optional) Synonym of filename, but deprecated. (Generates a warning.)

out_format

(optional) File format of the output, either 'svg', 'png', 'pdf', or 'plot' (default if filename is not specified). For more information, see Details.

width

(optional) Used to specify the width of the output in inches. If not specified, defaults to the current figure width.

height

(optional) Used to specify the height of the output in inches. If not specified, defaults to the current figure height.

.mai.plot

(optional) A parameter specifying the margins of the plot in inches, excluding legends. This is calculated automatically and, for most purposes, will not need to be specified.

cex

(optional) Font size in cex units. This parameter is used as a basis for setting the various other font sizes including those of leaf/node labels, cluster/subnet labels, and legend text sizes.

subnetColorsFunction

(optional) Function for assigning colors to subnets. Only used when color_labels_by == 'subnet'. The default value is gsnDendroSubnetColors_dark.

id_col

(optional) Character vector of length 1 indicating the name of the column to be used as an ID key in the pathways dataframe (or modules data if that is used, see below). This column should contain the same values as the names of the gene sets. This defaults to the value of the pathways id_col field.

id_nchar

(optional) Integer indicating the number of characters to reserve in the dendrogram plot for the ID. If unspecified, it is equal to the maximal nchar of the specified ID (id_col or substitute_id_col).

pathways_title_col

(optional) Character vector of length 1 indicating the name of the column in the pathways or modules data.frame to be used as a Title or descriptor in the plot. If not set the function looks for the following names: "Title", "Name", "NAME", "STANDARD_NAME", and takes the first that it finds. If set to NA, the title part of the label is suppressed.

substitute_id_col

(optional) Character vector of length 1 indicating a column used to substitute an alternative ID for labeling gene sets in the data set. If set to NA, the ID in the plot is disabled.

font_face

(optional) The font used for plot text, including leaf labels. For best results, this should be a monospaced font. If not specified, the system attempts to pick a suitable default: 'Andale Mono' on Mac OS X, 'Lucida Sans Typewriter' for Windows, and 'mono' for all other systems.

color_labels_by

(optional) This parameter tells the plotting function to assign colors to dendrogram leaf labels on the basis of this argument. Currently, only 'subnet' and NULL are supported arguments.

show.leaves

(optional) Logical telling the function to display leaves representing gene sets. When stat_col and optionally stat_col_2 are specified, naming columns in the pathways_dat data.frame, a single or two-color color scale is used to represent the value of the corresponding pathways statistics.

show.legend

(optional) A logical value telling the plotting function to include legends. (default: TRUE)

pathways_dat

(optional) data.frame containing associated pathways data. This defaults to whatever pathways data has already been imported into this GSNData object in object$pathways$data.

stat_col

(optional) This is the name of the column in the pathways data.frame that contains a significance value for coloring network vertices. The default value is specified by object$pathways$stat_col.

stat_col_2

(optional) This is the name of an optional second column in the pathways data.frame that contains a significance value for coloring network vertices in a 2-color network. The default value is specified by object$pathways$stat_col_2. When specified, a 2-color network is generated. To force a 2-color network to plot as a standard 1-color network using stat_col alone, use stat_col_2 = NA.

sig_order

(optional) This indicates the behavior of stat_col, whether low values ('loToHi') or high values ('hiToLo') are most significant. The default value is specified in object$pathways$sig_order.

sig_order_2

(optional) This indicates the behavior of stat_col_2, whether low values ('loToHi') or high values ('hiToLo') are most significant. The default value is specified in object$pathways$sig_order_2.

n_col

(optional) This is the name of the column in the pathways data.frame that contains a value for gene set size, or any other value intended to be the basis of leaf scaling. When specified, leaf sizes will be scaled by this value. (default: the value in object$pathways$n_col) An NA value can be used to override the value in object$pathways$n_col and suppress leaf scaling.

transform_function

(optional) Function to transform significance values for conversion to a color scale. Normally, significance values are p-values and need log transformation. Significance values of 0 are converted to -Inf by log transformation, so the function nzLog10() adds a small pseudocount to the values prior to log10 transformation to mitigate this problem. For other types of data, other transformations or even 'identity' may be more suitable. (default: nzLog10)
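To illustrate why a pseudocount is needed, the sketch below defines a hypothetical helper, nz_log10_sketch(), that is not the package's nzLog10() implementation. It assumes a pseudocount equal to a fraction of the smallest nonzero value; the actual rule used by nzLog10() may differ.

```r
# Hypothetical stand-in for a zero-tolerant log10 transform.
# Assumption: the pseudocount is a fraction of the smallest nonzero value.
nz_log10_sketch <- function(x, pseudocount_fraction = 0.5) {
  nonzero <- x[!is.na(x) & x > 0]
  # Only shift the values when zeros are actually present.
  if (length(nonzero) > 0 && any(x == 0, na.rm = TRUE)) {
    x <- x + pseudocount_fraction * min(nonzero)
  }
  log10(x)
}

p_values <- c(0, 1e-8, 0.003, 0.2)
plain <- log10(p_values)              # first element is -Inf
shifted <- nz_log10_sketch(p_values)  # all elements are finite
```

A function of this shape could be passed as transform_function, and identity could be passed for data that are already on a suitable scale.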

leaf_colors

(optional) A vector containing at least 2 colors for generating a color gradient in single channel visualizations. (default: c("white","yellow","red"), see details)
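A color gradient of this kind can be sketched in base R with colorRampPalette(). The mapping below from transformed significance values to gradient positions is a simple linear rescaling chosen for illustration, with more significant values assumed to map toward the last color; it is not necessarily how the function assigns leaf colors. The variable names are hypothetical.

```r
# Build a 100-step gradient from the default single-channel colors.
colors_n <- 100
palette_fun <- colorRampPalette(c("white", "yellow", "red"))
gradient <- palette_fun(colors_n)

# Toy significance scores: -log10 of three p-values.
scores <- -log10(c(0.5, 0.01, 1e-6))

# Linearly rescale the scores onto gradient indices 1..colors_n.
rel <- (scores - min(scores)) / (max(scores) - min(scores))
idx <- 1 + round(rel * (colors_n - 1))
leaf_fill <- gradient[idx]
```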

leaf_colors.1

(optional) A vector containing at least 2 colors for generating a color gradient in dual channel visualizations. (default: c("white", "red"), see details)

leaf_colors.2

(optional) A vector containing at least 2 colors for generating a color gradient in dual channel visualizations. (default: c("white", "blue"), see details)

leaf_border_color

(optional) For R's open plot symbols (pch in 21, 22, 23, 24, 25), which support fill with a 'bg' color, the leaf border color may be specified with this option. (default: "#666666")

legend.leaf.col

(optional) Leaf fill color for the legend. (default: "#CCCCCC")

combine_method

(optional) For dual channel plots this is a string used to indicate how colors are combined to generate a two dimensional color scale. Options are "scaled_geomean" (same as "default"), "standard" (same as "euclidean" ), "negative_euclidean", "mean", and "additive". See details.

use_leaf_border

(optional) When automatically choosing a leaf symbol (leaves_pch), this option determines whether a solid or an open symbol is used (see details).

render.plot

(optional) Logical value indicating whether to actually render the plot, or simply return a dendrogram. This may be useful if graphical parameters need to be calculated but rendering is not desired. (See Value.)

c1.fun

(optional) Function to convert the vector of numeric values represented by stat_col to a character vector corresponding to colors. For dual channel plots, these colors may be combined with a second array of colors by the method specified using the combine_method parameter. If not specified, c1.fun is calculated automatically as a linear function.

c2.fun

(optional) Same as c1.fun but for stat_col_2.

geometry

(optional) Specifies either "horizontal" or "circular" type dendrogram plots. (default: horizontal)

.plt.plot

(optional) Specifies the plot region of the output using figure coordinates, and excluding the legends. This can provide a greater degree of control for plotting, but most users will not need to adjust this. See the plt argument of the par graphics function for more information.

leaves_pch

(optional) Used to specify the pch symbol used to represent dendrogram leaves. (default: 22 (open square), for horizontal dendrograms and dendextend version >= '1.16.0'; 15 (solid square) for horizontal dendrograms with dendextend version < '1.16.0', and for circular dendrograms, 16 (solid circle))
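The default described above can be written out as a small decision rule. The helper choose_leaf_pch() and its arguments are hypothetical and only restate the documented defaults; in practice the installed dendextend version could be obtained with utils::packageVersion("dendextend").

```r
# Hypothetical helper restating the documented leaves_pch defaults.
choose_leaf_pch <- function(geometry, dendextend_version) {
  if (geometry == "circular") {
    return(16)  # solid circle
  }
  if (package_version(dendextend_version) >= "1.16.0") {
    22          # open square (border plus fill)
  } else {
    15          # solid square
  }
}

pch_new <- choose_leaf_pch("horizontal", "1.16.0")
pch_old <- choose_leaf_pch("horizontal", "1.15.2")
pch_circ <- choose_leaf_pch("circular", "1.17.1")
```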

leaf_char_shift

(optional) A parameter telling the function by how many character widths to shift the leaf labels. (default: 1)

na.color

(optional) The color used for NA values. (default: "#CCCCCC")

leaf_cex

(optional) The cex size of the leaf symbols. This is used when n_col is not specified, i.e. there is no leaf size scaling. (default: 1.5 * lab.cex)

leaf_cex_range

(optional) The range of leaf sizes used in plots, from low to high. This is used when n_col is specified and leaf sizes are to be scaled. This may need to be reduced if leaves overlap or are clipped on one side. (default: c(0.5, 2.1))
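One simple way to picture this scaling is a linear map from gene set sizes onto the cex range. The sketch below is an assumption for illustration only; the function may use a different scaling rule, and the names scale_leaf_cex and set_sizes are hypothetical.

```r
# Hypothetical linear rescaling of gene set sizes onto a cex range.
scale_leaf_cex <- function(n, cex_range = c(0.5, 2.1)) {
  rel <- (n - min(n)) / (max(n) - min(n))
  cex_range[1] + rel * (cex_range[2] - cex_range[1])
}

set_sizes <- c(15, 40, 250, 90)
leaf_sizes <- scale_leaf_cex(set_sizes)
```

Under this assumption the smallest gene set gets a leaf of cex 0.5 and the largest gets 2.1, so narrowing the range shrinks the largest leaves first.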

lab.cex

(optional) The cex size of dendrogram leaf labels (default: 0.9 * cex).

tree_x_size.in

(optional) For horizontal dendrograms, this is the width of the dendrogram in inches, not including leaf labels, cluster brackets, or legends. (default: 2)

legend_x_size.in

(optional) The width of legends in inches. (default: 2)

left_margin.in

(optional) The width of the left margin in inches. Ignored if .plt.plot or .mai.plot is specified. (default: 0)

right_margin.in

(optional) The width of the right margin of the dendrogram in inches. Ignored if .plt.plot or .mai.plot is specified. If unspecified, this is calculated automatically as width - tree_x_size.in.

top_margin.in

(optional) The height of the top margin of the dendrogram in inches. Ignored if .plt.plot or .mai.plot is specified. (default: 0 if no main argument is specified; otherwise, it is calculated as cex.main * par('cin')[2] * mar.main)
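The default top margin formula can be evaluated directly. The sketch below opens a null PDF device so that querying par() does not create a file, and uses the documented defaults for cex.main (1.35 * cex) and mar.main (3.2); the variable names are hypothetical.

```r
# Open a null device so par() can be queried without writing a file.
pdf(NULL)
cex <- par("cex")
cex_main <- 1.35 * cex   # documented default for cex.main
mar_main <- 3.2          # documented default for mar.main

# par("cin")[2] is the character height in inches.
top_margin_in <- cex_main * par("cin")[2] * mar_main
invisible(dev.off())
```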

bottom_margin.in

(optional) The height of the bottom margin in inches. Ignored if .plt.plot or .mai.plot is specified. (default: 0)

legend.downshift.in

(optional) Argument shifting the legend downward, in inches. This is useful for adjusting the alignment of the legend(s) with the top of the plot. (default: for horizontal dendrograms, 0; for circular dendrograms, 0.42)

bkt_lmargin_chars

(optional) Width in character widths of the space between the leaf labels and the brackets indicating cluster/subnet groups. If the leaf labels need more space, this can be increased. (default: 4)

legend_spacing.x.in

(optional) Space between plot and legend in inches. With some plot configurations, it may be useful to use negative values to bring the legends closer to the plot region. (default: 2 character widths)

legend_spacing.y.in

(optional) Space between legends in inches. (default: 1 character height)

legend.lab.cex

(optional) Legend x and y label size in cex. If unspecified, the function tries to pick a reasonable value based on available space.

legend.axis.cex

(optional) Legend axis label size in cex. If unspecified, the function tries to pick a reasonable value based on available space.

legend.free.cex.bool

(optional) Logical allowing independent optimized sizing of legend label font sizes if TRUE. (default: FALSE)

main

(optional) Legend main title. (default: NULL)

cex.main

(optional) Font size in cex units for the main title. (default: 1.35 * cex)

mar.main

(optional) Tells the function to reserve this many line heights for the main title. (default: 3.2)

lines.main

(optional) Tells the function to place the main title this many lines away from the plot edge. (default: 1.5)

colors.n

(optional) The number of colors per dimension of the color scale. For single channel plots, this will be equal to the number of colors in the color scale. For 2 channel plots, the number of colors is the square of this number. (default: 100)

legend.bg

(optional) The color of the legend background. (default: par('bg'))

legend.fg

(optional) The color of the legend foreground. (default: par('fg'))

resolution

(optional) Image resolution in pixels per inch, only for bitmap image output formats (currently png only). (default: 72)

draw.legend.box.bool

(optional) Logical indicating whether bounding boxes should be drawn for the legends. (default: TRUE)

DO_BROWSER

(optional) Logical indicating whether browser() should be run for this function. (For debugging purposes; likely to be removed.)

Details

Outputs of type pdf, png, and svg are supported for file outputs. File type is automatically detected from the file suffix, but can be overridden using the out_format argument.
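The suffix-based detection with an out_format override can be sketched as follows. The helper detect_out_format() is hypothetical and written only to illustrate the described behavior; it is not the package's internal logic.

```r
# Hypothetical helper: an explicit out_format wins, otherwise the file
# suffix decides, and with no filename the current device ('plot') is used.
detect_out_format <- function(filename = NULL, out_format = NULL) {
  if (!is.null(out_format)) {
    return(out_format)
  }
  if (is.null(filename)) {
    return("plot")
  }
  ext <- tolower(tools::file_ext(filename))
  if (!ext %in% c("svg", "png", "pdf")) {
    stop("Unrecognized file suffix: ", ext)
  }
  ext
}

fmt_suffix <- detect_out_format("dendrogram.PNG")
fmt_device <- detect_out_format()
fmt_forced <- detect_out_format("dendrogram.dat", out_format = "pdf")
```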

Open symbols (with a border and a fill color, pch in 21, 22, 23, 24, 25) are used by default for horizontal dendrograms on dendextend versions >= '1.16.0'. For earlier versions, and with circular dendrograms, open symbols are currently unsupported.

See Also

gsnPareNetGenericHierarchic, gsnPlotNetwork