ggstatsplot (version 0.3.1)

ggcorrmat: Visualization of a correlation matrix

Description

Visualization of a correlation matrix

Usage

ggcorrmat(
  data,
  cor.vars = NULL,
  cor.vars.names = NULL,
  output = "plot",
  matrix.type = "full",
  method = "square",
  type = "parametric",
  beta = 0.1,
  k = 2,
  sig.level = 0.05,
  conf.level = 0.95,
  p.adjust.method = "none",
  pch = 4,
  ggcorrplot.args = list(outline.color = "black"),
  package = "RColorBrewer",
  palette = "Dark2",
  direction = 1,
  colors = c("#E69F00", "white", "#009E73"),
  ggtheme = ggplot2::theme_bw(),
  ggstatsplot.layer = TRUE,
  title = NULL,
  subtitle = NULL,
  caption = NULL,
  caption.default = TRUE,
  messages = TRUE,
  ...
)

Arguments

data

Data frame from which the specified variables are to be taken.

cor.vars

List of variables for which the correlation matrix is to be computed and visualized. If NULL (default), all numeric variables from data will be used.
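The default selection can be approximated in base R. The sketch below only illustrates the idea and is not ggcorrmat's internal code: it keeps the numeric columns of iris and passes them to stats::cor(). The names num_cols and r_mat are illustrative, and the use = "pairwise.complete.obs" setting is an assumption.

```r
# Keep only the numeric columns, mirroring the described default for cor.vars = NULL
num_cols <- vapply(iris, is.numeric, logical(1))
r_mat <- stats::cor(iris[, num_cols], use = "pairwise.complete.obs")

# Species is a factor, so it is dropped; four numeric variables remain
round(r_mat, 2)
```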

cor.vars.names

Optional list of names to be used for cor.vars. The names should be entered in the same order.

output

Character that decides the expected output from this function: "plot" (for the visualization matrix), "correlations" (or "corr" or "r"; for the correlation matrix), "p-values" (or "p.values" or "p"; for a matrix of p-values), "ci" (for a tibble with confidence intervals for unique correlation pairs; not available for robust correlation), or "n" (or "sample.size"; for a tibble with sample sizes for each correlation pair).

matrix.type

Character, "full" (default), "upper" or "lower"; displays the full matrix, the upper triangular matrix, or the lower triangular matrix, respectively.
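What the triangular options keep can be sketched with base R masks. This is an illustration under assumptions, not the package's implementation; in particular, dropping the diagonal in the triangular views is an assumption, and lower_only and upper_only are illustrative names.

```r
r_mat <- stats::cor(iris[, 1:4])

# "lower": blank out everything on or above the diagonal
lower_only <- r_mat
lower_only[upper.tri(lower_only, diag = TRUE)] <- NA

# "upper": blank out everything on or below the diagonal
upper_only <- r_mat
upper_only[lower.tri(upper_only, diag = TRUE)] <- NA

round(lower_only, 2)
```

With four variables, each triangular view holds the six unique correlation pairs.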

method

Character argument that decides the visualization method of the correlation matrix. Allowed values are "square" (default) and "circle"; the value is passed on to ggcorrplot::ggcorrplot.

type

A character string indicating which correlation coefficient is to be computed. The default, "parametric", computes Pearson's r; "nonparametric" computes Spearman's rho; "robust" computes the percentage bend correlation (see ?WRS2::pball) and can be entered only if the output argument is set to either "correlations" or "p-values". Abbreviations will also work: "p" (for parametric/Pearson's r), "np" (nonparametric/Spearman's rho), "r" (robust).

beta

A numeric bending constant for percentage bend robust correlation coefficient (Default: 0.1).

k

Decides the number of decimal digits to be displayed (Default: 2).

sig.level

Significance level (Default: 0.05). If the p-value in p-value matrix is bigger than sig.level, then the corresponding correlation coefficient is regarded as insignificant and flagged as such in the plot. This argument is relevant only when output = "plot".

conf.level

Scalar between 0 and 1 giving the confidence level for the confidence intervals. The default (0.95) returns the lower and upper bounds of 95% confidence intervals.
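The effect of the confidence level on an interval for a single Pearson correlation can be seen with stats::cor.test(). This is a base R sketch for one variable pair; ggcorrmat may compute its intervals differently, and the names ci_95 and ci_99 are illustrative.

```r
x <- iris$Sepal.Length
y <- iris$Petal.Length

# Confidence intervals for Pearson's r at two confidence levels
ci_95 <- stats::cor.test(x, y, conf.level = 0.95)$conf.int
ci_99 <- stats::cor.test(x, y, conf.level = 0.99)$conf.int

ci_95
ci_99
```

A higher confidence level gives a wider interval around the same estimate.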

p.adjust.method

What adjustment for multiple tests should be used? ("holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"). See stats::p.adjust for details about why to use "holm" rather than "bonferroni". Default is "none". If adjusted p-values are displayed in the visualization of the correlation matrix, the adjusted p-values will be used for the upper triangle, while unadjusted p-values will be used for the lower triangle of the matrix.
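The adjustment itself is done by stats::p.adjust(), which can be tried directly on a vector of p-values. The sketch below assumes Pearson tests on the unique variable pairs of iris and is not the package's own code; var_pairs, p_raw, p_holm and p_bonf are illustrative names.

```r
vars <- iris[, 1:4]
var_pairs <- utils::combn(names(vars), 2)  # one column per unique variable pair

# Unadjusted p-value of the Pearson correlation test for each pair
p_raw <- apply(var_pairs, 2, function(v) {
  stats::cor.test(vars[[v[1]]], vars[[v[2]]])$p.value
})

# Two of the adjustments accepted by p.adjust.method
p_holm <- stats::p.adjust(p_raw, method = "holm")
p_bonf <- stats::p.adjust(p_raw, method = "bonferroni")

cbind(p_raw, p_holm, p_bonf)
```

Holm-adjusted p-values are never larger than the Bonferroni-adjusted ones, which is why "holm" is usually the less conservative choice.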

pch

Decides the glyphs (read point shapes) to be used for insignificant correlation coefficients (only valid when insig = "pch"). Default value is pch = 4.

ggcorrplot.args

A list of additional (mostly aesthetic) arguments that will be passed to ggcorrplot::ggcorrplot function. The list should avoid any of the following arguments since they are already being used: corr, method, p.mat, sig.level, ggtheme, colors, matrix.type, lab, pch, legend.title, digits.

package

Name of package from which the palette is desired as string or symbol.

palette

Name of palette as string or symbol.

direction

Either 1 or -1. If -1 the palette will be reversed.

colors

A vector of 3 colors for low, mid, and high correlation values. If set to NULL, manual specification of colors will be turned off and 3 colors from the specified palette from package will be selected.

ggtheme

A function, ggplot2 theme name. Default value is ggplot2::theme_bw(). Any of the ggplot2 themes, or themes from extension packages are allowed (e.g., ggthemes::theme_fivethirtyeight(), hrbrthemes::theme_ipsum_ps(), etc.).

ggstatsplot.layer

Logical that decides whether theme_ggstatsplot theme elements are to be displayed along with the selected ggtheme (Default: TRUE). theme_ggstatsplot is an opinionated theme layer that overrides some aspects of the selected ggtheme.

title

The text for the plot title.

subtitle

The text for the plot subtitle.

caption

The text for the plot caption. If NULL, a default caption will be shown.

caption.default

Logical that decides whether the default caption should be shown (default: TRUE).

messages

Decides whether messages, references, notes, and warnings are to be displayed (Default: TRUE).

...

Currently ignored.

Value

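Depending on output: a correlation matrix plot, a correlation coefficient matrix, a matrix of p-values, or a tibble (confidence intervals for output = "ci", sample sizes for output = "n").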

References

https://indrajeetpatil.github.io/ggstatsplot/articles/web_only/ggcorrmat.html

See Also

grouped_ggcorrmat, ggscatterstats, grouped_ggscatterstats

Examples

# NOT RUN {
# for reproducibility
set.seed(123)

# if `cor.vars` not specified, all numeric variables used
ggstatsplot::ggcorrmat(data = iris)

# to get the correlogram
# note that the function will run even if the vector with variable names is
# not of same length as the number of variables
ggstatsplot::ggcorrmat(
  data = ggplot2::msleep,
  cor.vars = sleep_total:bodywt,
  cor.vars.names = c("total sleep", "REM sleep")
) + # further modification using `ggplot2`
  ggplot2::scale_y_discrete(position = "right")

# to get the correlation matrix
ggstatsplot::ggcorrmat(
  data = ggplot2::msleep,
  cor.vars = sleep_total:bodywt,
  output = "r"
)

# setting output = "p-values" (or "p") will return the p-value matrix
ggstatsplot::ggcorrmat(
  data = ggplot2::msleep,
  cor.vars = sleep_total:bodywt,
  type = "r",
  p.adjust.method = "bonferroni",
  output = "p"
)

# setting `output = "ci"` will return the confidence intervals for unique
# correlation pairs
ggstatsplot::ggcorrmat(
  data = ggplot2::msleep,
  cor.vars = sleep_total:bodywt,
  p.adjust.method = "BH",
  output = "ci"
)

# modifying elements of the correlation matrix by changing function defaults
ggstatsplot::ggcorrmat(
  data = datasets::iris,
  cor.vars = c(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width),
  sig.level = 0.01,
  ggtheme = ggplot2::theme_bw(),
  matrix.type = "lower",
  ggcorrplot.args = list(hc.order = TRUE, outline.color = "white"),
  title = "Dataset: Iris"
)
# }
