Test for association between paired samples, using one of
Pearson's product moment correlation coefficient,
Kendall's τ or Spearman's ρ.
cor.test(x, ...)

# S3 method for default
cor.test(x, y,
         alternative = c("two.sided", "less", "greater"),
         method = c("pearson", "kendall", "spearman"),
         exact = NULL, conf.level = 0.95, continuity = FALSE, ...)

# S3 method for formula
cor.test(formula, data, subset, na.action, ...)
x, y: numeric vectors of data values. x and y must have the same length.
alternative: indicates the alternative hypothesis and must be one of "two.sided", "greater" or "less". You can specify just the initial letter. "greater" corresponds to positive association, "less" to negative association.
method: a character string indicating which correlation coefficient is to be used for the test. One of "pearson", "kendall", or "spearman"; can be abbreviated.
exact: a logical indicating whether an exact p-value should be computed. Used for Kendall's τ and Spearman's ρ. See the description of the individual tests below for the meaning of NULL (the default).
conf.level: confidence level for the returned confidence interval. Currently only used for the Pearson product moment correlation coefficient if there are at least 4 complete pairs of observations.
continuity: logical: if true, a continuity correction is used for Kendall's τ and Spearman's ρ when not computed exactly.
formula: a formula of the form ~ u + v, where each of u and v are numeric variables giving the data values for one sample. The samples must be of the same length.
data: an optional matrix or data frame (or similar: see model.frame) containing the variables in the formula formula. By default the variables are taken from environment(formula).
subset: an optional vector specifying a subset of observations to be used.
na.action: a function which indicates what should happen when the data contain NAs. Defaults to getOption("na.action").
...: further arguments to be passed to or from methods.
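As a brief illustration of the formula-interface arguments, the sketch below restricts the test to a subset of rows and passes alternative through to the default method. It uses the built-in USJudgeRatings data; the cutoff CONT > 7 is an arbitrary choice made here for illustration only.

```r
## Formula interface with a row subset; the cutoff 7 is arbitrary.
keep <- USJudgeRatings$CONT > 7
res_sub <- cor.test(~ CONT + INTG, data = USJudgeRatings,
                    subset = CONT > 7, alternative = "less")

## For the Pearson test the degrees of freedom are (number of pairs) - 2,
## so these two values should agree.
res_sub$parameter
sum(keep) - 2
```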
A list with class "htest" containing the following components:
statistic: the value of the test statistic.
parameter: the degrees of freedom of the test statistic in the case that it follows a t distribution.
p.value: the p-value of the test.
estimate: the estimated measure of association, with name "cor", "tau", or "rho" corresponding to the method employed.
null.value: the value of the association measure under the null hypothesis, always 0.
alternative: a character string describing the alternative hypothesis.
method: a character string indicating how the association was measured.
data.name: a character string giving the names of the data.
conf.int: a confidence interval for the measure of association. Currently only given for Pearson's product moment correlation coefficient in case of at least 4 complete pairs of observations.
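The components of the returned list can be extracted with `$`. A minimal sketch, again using the built-in USJudgeRatings data with the default Pearson method:

```r
res <- cor.test(~ CONT + INTG, data = USJudgeRatings)

class(res)        # "htest"
res$statistic     # test statistic
res$parameter     # degrees of freedom
res$p.value
res$estimate      # named "cor" for the Pearson method
res$null.value    # value under the null hypothesis
res$conf.int      # carries the confidence level as an attribute
attr(res$conf.int, "conf.level")
```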
The three methods each estimate the association between paired samples
and compute a test of the value being zero. They use different
measures of association, all in the range [-1, 1], with 0 indicating no association.
If method is "pearson", the test statistic is based on
Pearson's product moment correlation coefficient cor(x, y) and
follows a t distribution with length(x)-2 degrees of freedom
if the samples follow independent normal distributions. If there are
at least 4 complete pairs of observations, an asymptotic confidence
interval is given based on Fisher's Z transform.
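The Pearson case can be checked by hand. The sketch below uses the standard textbook formulas for the t statistic and the Fisher Z interval, on the assumption that they correspond to what cor.test computes for the two-sided default; the data are the tuna measurements from the examples.

```r
## Tuna data (Hollander & Wolfe, 1973), as used in the examples.
x <- c(44.4, 45.9, 41.9, 53.3, 44.7, 44.1, 50.7, 45.2, 60.1)
y <- c( 2.6,  3.1,  2.5,  5.0,  3.6,  4.0,  5.2,  2.8,  3.8)
n <- length(x)
r <- cor(x, y)

## t statistic on n - 2 degrees of freedom and its two-sided p-value
t_manual <- r * sqrt((n - 2) / (1 - r^2))
p_manual <- 2 * pt(-abs(t_manual), df = n - 2)

## 95% interval: transform with atanh, use standard error 1/sqrt(n - 3),
## then transform back with tanh
ci_manual <- tanh(atanh(r) + c(-1, 1) * qnorm(0.975) / sqrt(n - 3))

## Compare with the values reported by cor.test
res <- cor.test(x, y)
all.equal(unname(res$statistic), t_manual)
all.equal(unname(res$p.value), p_manual)
all.equal(as.numeric(res$conf.int), ci_manual)
```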
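If method is "kendall" or "spearman", Kendall's τ or Spearman's ρ
statistic is used to estimate a rank-based measure of association.
These tests may be used if the data do not necessarily come from a
bivariate normal distribution.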
For Kendall's test, by default (if exact is NULL), an exact
p-value is computed if there are fewer than 50 paired samples containing
finite values and there are no ties. Otherwise, the test statistic is
the estimate scaled to zero mean and unit variance, and is approximately
normally distributed.
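To make the scaling concrete, the following sketch reproduces the normal approximation for Kendall's test by hand. It assumes there are no ties, in which case the null variance of the concordant-minus-discordant count S is n(n-1)(2n+5)/18, and it assumes no continuity correction; with ties a different variance applies and this shortcut would not match.

```r
## Tuna data (Hollander & Wolfe, 1973), as used in the examples; no ties.
x <- c(44.4, 45.9, 41.9, 53.3, 44.7, 44.1, 50.7, 45.2, 60.1)
y <- c( 2.6,  3.1,  2.5,  5.0,  3.6,  4.0,  5.2,  2.8,  3.8)
n <- length(x)
tau <- cor(x, y, method = "kendall")

## S = (concordant pairs) - (discordant pairs); zero mean under the null
S <- tau * n * (n - 1) / 2
z <- S / sqrt(n * (n - 1) * (2 * n + 5) / 18)

## One-sided p-value for alternative = "greater"
p_approx <- pnorm(z, lower.tail = FALSE)

## Compare with cor.test when the exact computation is switched off
res <- cor.test(x, y, method = "kendall", alternative = "greater",
                exact = FALSE)
all.equal(unname(res$statistic), z)
all.equal(unname(res$p.value), p_approx)
```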
For Spearman's test, p-values are computed using algorithm AS 89 for
n < 1290 and exact = TRUE, otherwise via the asymptotic t approximation.
D. J. Best & D. E. Roberts (1975). Algorithm AS 89: The Upper Tail Probabilities of Spearman's ρ. Applied Statistics, 24, 377--379.
Myles Hollander & Douglas A. Wolfe (1973), Nonparametric Statistical Methods. New York: John Wiley & Sons. Pages 185--194 (Kendall and Spearman tests).
Kendall in package Kendall.
pKendall and pSpearman in package SuppDists, and spearman.test in package pspearman,
which supply different (and often more accurate) approximations.
# NOT RUN {
## Hollander & Wolfe (1973), p. 187f.
## Assessment of tuna quality. We compare the Hunter L measure of
## lightness to the averages of consumer panel scores (recoded as
## integer values from 1 to 6 and averaged over 80 such values) in
## 9 lots of canned tuna.
x <- c(44.4, 45.9, 41.9, 53.3, 44.7, 44.1, 50.7, 45.2, 60.1)
y <- c( 2.6, 3.1, 2.5, 5.0, 3.6, 4.0, 5.2, 2.8, 3.8)
## The alternative hypothesis of interest is that the
## Hunter L value is positively associated with the panel score.
cor.test(x, y, method = "kendall", alternative = "greater")
## => p=0.05972
cor.test(x, y, method = "kendall", alternative = "greater",
         exact = FALSE) # using large sample approximation
## => p=0.04765
## Compare this to
cor.test(x, y, method = "spearm", alternative = "g")
cor.test(x, y, alternative = "g")
## Formula interface.
require(graphics)
pairs(USJudgeRatings)
cor.test(~ CONT + INTG, data = USJudgeRatings)
# }