# correlate

##### Correlation matrices

Computes a correlation matrix and runs hypothesis tests, with corrections for multiple comparisons.

##### Usage

`correlate(x, y=NULL, test=FALSE, corr.method="pearson", p.adjust.method="holm")`

##### Arguments

- `x`: Matrix or data frame containing variables to be correlated.
- `y`: Optionally, a second set of variables to be correlated with those in `x`.
- `test`: Should hypothesis tests be displayed? (Default = `FALSE`)
- `corr.method`: What kind of correlations should be computed? Default is `"pearson"`, but `"spearman"` and `"kendall"` are also supported.
- `p.adjust.method`: What method should be used to correct for multiple comparisons? Default value is `"holm"`, and the allowable values are the same as for `p.adjust`.

##### Details

The `correlate` function calculates a correlation matrix between all pairs of variables. Much like the `cor` function, if the user inputs only one set of variables (`x`) then it computes all pairwise correlations between the variables in `x`. If the user specifies both `x` and `y`, it correlates the variables in `x` with the variables in `y`.
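The one-set versus two-set behaviour mirrors base R's `cor`, so it can be illustrated without the package. The following is a minimal sketch using `cor` only; the data frame `scores` and its values are made up for illustration.

```r
# Small all-numeric data frame (illustrative values only)
scores <- data.frame(
  anxiety    = c(1.3, 2.7, 3.2, 4.2, 5.6),
  stress     = c(2.0, 3.5, 2.0, 3.3, 4.3),
  depression = c(2.5, 1.8, 3.3, 5.8, 9.0)
)

# One set of variables: a square matrix of all pairwise correlations
within <- cor(scores)
dim(within)

# Two sets: rows are the variables in x, columns are the variables in y
between <- cor(
  scores[, c("anxiety", "stress")],
  scores[, "depression", drop = FALSE]
)
dim(between)
```

With three variables in a single set the result has one row and one column per variable; with two variables in `x` and one in `y` the result has two rows and one column.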

Unlike the `cor` function, `correlate` does not generate an error if some of the variables are categorical (i.e., factors). Variables that are not of numeric (or integer) class are simply ignored. They appear in the output, but no correlations are reported for them. The decision to allow the user a little leniency when the input contains non-numeric variables deserves an explanation. The motivation is pedagogical rather than statistical. It is sometimes the case in psychology that students need to work with correlation matrices before they are comfortable subsetting a data frame, so it is convenient to allow them to type commands like `correlate(data)` even when `data` contains variables for which Pearson/Spearman correlations are not appropriate. (It is also useful to use the output of `correlate` to illustrate the fact that Pearson correlations should not be used for categorical variables.)
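For comparison, the sketch below shows what happens in base R when a factor is present, and one way to restrict `cor` to the numeric columns by hand. It is not the package's implementation: unlike `correlate`, it drops the non-numeric variable from the result rather than keeping it in the output. The data frame `df` and the helper names `cor_result`, `is_num`, and `numeric_only` are illustrative.

```r
df <- data.frame(
  anxiety = c(1.31, 2.72, 3.18, 4.21, 5.55),
  stress  = c(2.01, 3.45, 1.99, 3.25, 4.27),
  gender  = factor(c("male", "female", "female", "male", "female"))
)

# cor() stops with an error because 'gender' is a factor;
# tryCatch() records that instead of halting the script
cor_result <- tryCatch(cor(df), error = function(e) "error")

# Keep only numeric columns (is.numeric() is TRUE for integer and double,
# FALSE for factors), then correlate those
is_num <- vapply(df, is.numeric, logical(1))
numeric_only <- cor(df[, is_num, drop = FALSE])
numeric_only
```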

A second difference between `cor` and `correlate` is that `correlate` runs hypothesis tests for all correlations in the correlation matrix (using the `cor.test` function to do the work). The results of the tests are only displayed to the user if `test=TRUE`. This is a pragmatic choice, given the (perhaps unfortunate) fact that psychologists often want to see the results of these tests: it is probably not coincidental that the `corr.test` function in the psych package already provides this functionality (though the output is difficult for novices to read).

The concern with running hypothesis tests for all elements of a correlation matrix is that doing so inflates the Type I error rate. To minimise this risk, reported p-values are adjusted using the Holm method. The user can change this setting by specifying `p.adjust.method`. See `p.adjust` for details.
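The general idea of testing every pair and then adjusting the p-values can be sketched with base R's `cor.test` and `p.adjust`. This is an illustration under assumptions, not the package's code: in particular, it adjusts over the unique pairs of variables, which may or may not match how `correlate` groups the tests. The simulated data and the names `pairs`, `raw_p`, and `adj_p` are hypothetical.

```r
set.seed(1)
x <- data.frame(a = rnorm(20), b = rnorm(20), c = rnorm(20))

# Each column of 'pairs' holds the names of one pair of variables
pairs <- combn(names(x), 2)

# Unadjusted p-value from cor.test() for every pair
raw_p <- apply(pairs, 2, function(v) cor.test(x[[v[1]]], x[[v[2]]])$p.value)
names(raw_p) <- apply(pairs, 2, paste, collapse = "-")

# Holm correction across the whole family of tests
adj_p <- p.adjust(raw_p, method = "holm")

cbind(raw = raw_p, holm = adj_p)
```

Holm-adjusted p-values are never smaller than the unadjusted ones, which is why fewer correlations reach significance after correction.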

Missing data are handled using pairwise complete cases.
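Pairwise complete cases means each correlation uses every row in which both variables of that pair are observed, so different cells of the matrix can rest on different sample sizes. A minimal base R sketch follows; the matrix `m` is made up, and computing the sample sizes with `crossprod` is one convenient approach rather than a claim about how the package does it.

```r
m <- cbind(
  a = c(1.3, 2.7, 3.2, 4.2, 5.6, NA),
  b = c(2.0, 3.5, 2.0, 3.3, 4.3, 6.8),
  c = c(2.5, NA, 3.3, 5.8, 9.0, 7.7)
)

# Each correlation uses only the rows where both variables are present
r <- cor(m, use = "pairwise.complete.obs")

# Matching sample sizes: count the rows observed on both variables
observed <- (!is.na(m)) * 1   # 1 = observed, 0 = missing
n <- crossprod(observed)

r
n
```

Here `a` and `c` share only four complete rows, while `a` and `b` share five.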

##### Value

An object of class `correlate` (an S3 class). It is effectively a list containing four elements: `correlation` is the correlation matrix, `p.value` is the matrix of p-values, `sample.size` is the matrix of sample sizes, and `args` is a vector that stores information about what the user requested.

##### Warning

This package is under development, and has been released only due to teaching constraints. Until this notice disappears from the help files, you should assume that everything in the package is subject to change. Backwards compatibility is NOT guaranteed. Functions may be deleted in future versions and new syntax may be inconsistent with earlier versions. For the moment at least, this package should be treated with extreme caution.

##### See Also

`cor`, `cor.test`, `p.adjust`, `corr.test` (in the psych package)

##### Examples

```
library(lsr)

data <- data.frame(
  anxiety = c(1.31,2.72,3.18,4.21,5.55,NA),
  stress = c(2.01,3.45,1.99,3.25,4.27,6.80),
  depression = c(2.51,1.77,3.34,5.83,9.01,7.74),
  happiness = c(4.02,3.66,5.23,6.37,7.83,1.18),
  gender = factor( c("male","female","female","male","female","female") ),
  ssri = factor( c("no","no","no",NA,"yes","yes") )
)

# default output is just the (Pearson) correlation matrix
correlate( data )

# other types of correlation:
correlate( data, corr.method="spearman" )

# two meaningful subsets to be correlated:
nervous <- data[,c("anxiety","stress")]
happy <- data[,c("happiness","depression","ssri")]

# default output for two matrix input
correlate( nervous, happy )

# the same examples, with Holm-corrected p-values
correlate( data, test=TRUE )
correlate( nervous, happy, test=TRUE )
```
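Because the returned object is effectively a list, its components can also be inspected directly rather than only printed. The sketch below assumes the lsr package is installed and that the element names are those listed under Value; the data frame `scores` is illustrative, and the `requireNamespace` guard is there only so the snippet does nothing when the package is missing.

```r
if (requireNamespace("lsr", quietly = TRUE)) {
  scores <- data.frame(
    anxiety = c(1.31, 2.72, 3.18, 4.21, 5.55, NA),
    stress  = c(2.01, 3.45, 1.99, 3.25, 4.27, 6.80),
    gender  = factor(c("male", "female", "female", "male", "female", "female"))
  )

  out <- lsr::correlate(scores, test = TRUE)

  print(class(out))        # the S3 class of the result
  print(out$correlation)   # correlation matrix
  print(out$p.value)       # matrix of p-values
  print(out$sample.size)   # matrix of sample sizes
}
```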

*Documentation reproduced from package lsr, version 0.5, License: GPL-3*