corrr (version 0.4.0)

correlate: Correlation Data Frame

Description

An implementation of stats::cor() that returns a correlation data frame rather than a matrix. See Details below. Additional adjustments include the use of pairwise deletion by default.
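As a rough illustration of the pairwise-deletion default, the sketch below compares stats::cor() and correlate() on a small made-up data frame with one missing value. The data frame `d` and its values are invented for this example. It assumes the corrr package is installed.

```r
library(corrr)

# Hypothetical data with a single missing value in column `a`
d <- data.frame(
  a = c(1, 2, 3, 4, NA),
  b = c(2, 1, 4, 3, 5),
  c = c(5, 3, 4, 1, 2)
)

# stats::cor() defaults to use = "everything", so every
# correlation involving `a` comes back as NA
m <- stats::cor(d)
is.na(m["a", "b"])

# correlate() defaults to use = "pairwise.complete.obs", so the
# a-b correlation is computed from the four complete pairs
cd <- correlate(d, quiet = TRUE)
cd$b[1]
```

The first row of `cd` corresponds to `a`, so `cd$b[1]` is the a-b correlation.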

Usage

correlate(x, y = NULL, use = "pairwise.complete.obs",
  method = "pearson", diagonal = NA, quiet = FALSE)

Arguments

x

a numeric vector, matrix or data frame.

y

NULL (default) or a vector, matrix or data frame with compatible dimensions to x. The default is equivalent to y = x (but more efficient).

use

an optional character string giving a method for computing covariances in the presence of missing values. This must be (an abbreviation of) one of the strings "everything", "all.obs", "complete.obs", "na.or.complete", or "pairwise.complete.obs".

method

a character string indicating which correlation coefficient (or covariance) is to be computed. One of "pearson" (default), "kendall", or "spearman": can be abbreviated.

diagonal

Value (typically numeric or NA) to set the diagonal to.

quiet

Set to TRUE to suppress the message about the `method` and `use` parameters.
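A minimal sketch of how these arguments combine, using the built-in mtcars data. The choice of Spearman correlations and a diagonal of 1 is only for illustration.

```r
library(corrr)

# Spearman correlations, diagonal filled with 1, message suppressed
cd <- correlate(mtcars, method = "spearman", diagonal = 1, quiet = TRUE)

# Cross-check one cell against stats::cor() with the same settings
ref <- stats::cor(mtcars, use = "pairwise.complete.obs", method = "spearman")
all.equal(cd$cyl[1], ref["mpg", "cyl"])

# The first row is mpg, so this is the mpg-mpg diagonal entry
cd$mpg[1]
```

Leaving `quiet` at its default of FALSE prints a message reporting the `method` and `use` values in effect.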

Value

A correlation data frame (cor_df)

Details

The returned correlation data frame has the following format:

  • A tibble (see tibble)

  • An additional class, "cor_df"

  • A "rowname" column

  • Standardized variances (the matrix diagonal) set to missing values by default (NA) so they can be ignored in calculations.
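The properties listed above can be checked directly on a result. The following is a sketch using mtcars. It reads the first column by position rather than by name, so it does not depend on how that column is labelled.

```r
library(corrr)

cd <- correlate(mtcars, quiet = TRUE)

# Class vector includes "cor_df" alongside the tibble classes
class(cd)

# The first column holds the variable names
names(cd)[1]
cd[[1]]

# Collect the diagonal entries: row i, variable column i + 1
diag_vals <- sapply(seq_len(nrow(cd)), function(i) cd[[i + 1]][i])
all(is.na(diag_vals))

# With the diagonal set to NA, na.rm = TRUE leaves out the
# self-correlation when summarising a column
mean(cd$mpg, na.rm = TRUE)
```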

Examples

# NOT RUN {
# iris includes the non-numeric Species column; the next example drops it
correlate(iris)
# }
# NOT RUN {
correlate(iris[-5])

correlate(mtcars)

# }
# NOT RUN {
# Also supports DB backend and collects results into memory

library(sparklyr)
sc <- spark_connect(master = "local")
mtcars_tbl <- copy_to(sc, mtcars)
mtcars_tbl %>% 
  correlate(use = "pairwise.complete.obs", method = "spearman")
spark_disconnect(sc)

# }