ks.test
Kolmogorov-Smirnov Tests
Perform a one- or two-sample Kolmogorov-Smirnov test.
 Keywords
 htest
Usage
ks.test(x, y, ..., alternative = c("two.sided", "less", "greater"), exact = NULL)
Arguments
 x
 a numeric vector of data values.
 y
 either a numeric vector of data values, or a character string naming a cumulative distribution function, or an actual cumulative distribution function such as pnorm. Only continuous CDFs are valid.
 ...
 parameters of the distribution specified (as a character string) by y.
 alternative
 indicates the alternative hypothesis and must be one of "two.sided" (default), "less", or "greater". You can specify just the initial letter of the value, but the argument name must be given in full. See ‘Details’ for the meanings of the possible values.
 exact
 NULL or a logical indicating whether an exact p-value should be computed. See ‘Details’ for the meaning of NULL. Not available in the two-sample case for a one-sided test or if ties are present.
Details
If y is numeric, a two-sample test of the null hypothesis that x and y were drawn from the same continuous distribution is performed.
Alternatively, y can be a character string naming a continuous (cumulative) distribution function, or such a function. In this case, a one-sample test is carried out of the null hypothesis that the distribution function which generated x is distribution y with parameters specified by the ... arguments.
The presence of ties always generates a warning, since continuous distributions do not generate them. If the ties arose from rounding, the tests may be approximately valid, but even modest amounts of rounding can have a significant effect on the calculated statistic.
Missing values are silently omitted from x and (in the two-sample case) y.
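As a small illustration of the missing-value handling described above, the following sketch compares a one-sample test on clean data with the same test after appending NA values. The seed, sample size and variable names are arbitrary choices made for this illustration only.

```r
set.seed(123)                      # arbitrary seed, for reproducibility only
x <- rnorm(40)                     # continuous data, so no ties
x_with_na <- c(x, NA, NA)          # same data with two missing values appended

res_clean <- ks.test(x, "pnorm")
res_na    <- ks.test(x_with_na, "pnorm")

# Both comparisons are expected to be TRUE, because the NA values
# are dropped before the statistic is computed.
all.equal(res_clean$statistic, res_na$statistic)
all.equal(res_clean$p.value, res_na$p.value)
```

Because the omission is silent, it may be worth checking sum(is.na(x)) beforehand when the effective sample size matters.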
The possible values "two.sided", "less" and "greater" of alternative specify the null hypothesis that the true distribution function of x is equal to, not less than or not greater than the hypothesized distribution function (one-sample case) or the distribution function of y (two-sample case), respectively. This is a comparison of cumulative distribution functions, and the test statistic is the maximum difference in value, with the statistic in the "greater" alternative being $D^+ = \max_u [F_x(u) - F_y(u)]$. Thus in the two-sample case alternative = "greater" includes distributions for which x is stochastically smaller than y (the CDF of x lies above and hence to the left of that for y), in contrast to t.test or wilcox.test.
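To make the direction of the "greater" alternative concrete, the sketch below recomputes the statistic by hand from the two empirical CDFs and compares it with the value reported by ks.test. It assumes tie-free samples; the seed, sample sizes and the names u and d_plus_manual are illustrative and not part of the ks.test interface.

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
x <- rnorm(50)                     # x tends to be smaller than y ...
y <- rnorm(30, mean = 1)           # ... so the ECDF of x lies above that of y

Fx <- ecdf(x)                      # empirical CDF of x
Fy <- ecdf(y)                      # empirical CDF of y
u  <- sort(c(x, y))                # the difference can only change at pooled data points

# Maximum of F_x(u) - F_y(u) over the pooled sample
d_plus_manual <- max(Fx(u) - Fy(u))

res <- ks.test(x, y, alternative = "greater")

# The hand-computed value should agree with the reported statistic
all.equal(d_plus_manual, unname(res$statistic))
```

Here a large value of the statistic goes with x being stochastically smaller than y, which is the opposite of what alternative = "greater" means for t.test or wilcox.test.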
Exact p-values are not available for the two-sample case if one-sided or in the presence of ties. If exact = NULL (the default), an exact p-value is computed if the sample size is less than 100 in the one-sample case and there are no ties, and if the product of the sample sizes is less than 10000 in the two-sample case. Otherwise, asymptotic distributions are used whose approximations may be inaccurate in small samples. In the one-sample two-sided case, exact p-values are obtained as described in Marsaglia, Tsang & Wang (2003) (but not using the optional approximation in the right tail, so this can be slow for small p-values). The formula of Birnbaum & Tingey (1951) is used for the one-sample one-sided case.
If a single-sample test is used, the parameters specified in ... must be pre-specified and not estimated from the data. There is some more refined distribution theory for the KS test with estimated parameters (see Durbin, 1973), but that is not implemented in ks.test.
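The requirement that parameters be pre-specified can be illustrated with a small simulation. The sketch below draws normal samples and applies the one-sample test twice: once with the true parameters fixed in advance, and once with the mean and standard deviation estimated from the same sample. The number of replications, the sample size and the 0.05 threshold are arbitrary choices for this illustration.

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
n_rep <- 500                       # number of simulated samples (illustrative)
n     <- 50                        # size of each sample (illustrative)

p_fixed     <- numeric(n_rep)
p_estimated <- numeric(n_rep)

for (i in seq_len(n_rep)) {
  x <- rnorm(n)                    # data really are N(0, 1)
  # Parameters fixed in advance: the setting ks.test is designed for
  p_fixed[i] <- ks.test(x, "pnorm", mean = 0, sd = 1)$p.value
  # Parameters estimated from the same data: NOT covered by ks.test
  p_estimated[i] <- ks.test(x, "pnorm", mean = mean(x), sd = sd(x))$p.value
}

# Proportion of samples rejected at the 5% level under each approach
rate_fixed     <- mean(p_fixed < 0.05)
rate_estimated <- mean(p_estimated < 0.05)
c(fixed = rate_fixed, estimated = rate_estimated)
```

With pre-specified parameters the rejection rate should be close to the nominal level, whereas with estimated parameters the fitted distribution follows the data too closely and the reported p-values tend to be too large, so the test rejects far less often than it should.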
Value

A list with class "htest" containing the following components:
 statistic
 the value of the test statistic.
 p.value
 the p-value of the test.
 alternative
 a character string describing the alternative hypothesis.
 method
 a character string indicating what type of test was performed.
 data.name
 a character string giving the name(s) of the data.
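As a brief sketch of how the returned object might be used, the components can be extracted with the usual list accessors. The seed and data below are arbitrary, and the exact strings stored in method and alternative may differ between R versions, so no output is shown.

```r
set.seed(42)                       # arbitrary seed, for reproducibility only
x <- rnorm(50)

res <- ks.test(x, "pnorm")         # one-sample test against the standard normal

class(res)                         # the "htest" class
res$statistic                      # value of the test statistic
res$p.value                        # p-value of the test
res$alternative                    # description of the alternative hypothesis
res$method                         # type of test performed
res$data.name                      # name(s) of the data
```

Extracting res$p.value directly is usually more convenient than parsing the printed output, for example when many tests are run in a loop.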
Source
The two-sided one-sample distribution comes via Marsaglia, Tsang and Wang (2003).
References
Z. W. Birnbaum and Fred H. Tingey (1951), One-sided confidence contours for probability distribution functions. The Annals of Mathematical Statistics, 22/4, 592-596.
William J. Conover (1971), Practical Nonparametric Statistics. New York: John Wiley & Sons. Pages 295-301 (one-sample Kolmogorov test), 309-314 (two-sample Smirnov test).
J. Durbin (1973), Distribution theory for tests based on the sample distribution function. SIAM.
George Marsaglia, Wai Wan Tsang and Jingbo Wang (2003), Evaluating Kolmogorov's distribution. Journal of Statistical Software, 8/18. http://www.jstatsoft.org/v08/i18/.
See Also
shapiro.test which performs the Shapiro-Wilk test for normality.
Examples
library(stats)
require(graphics)
x <- rnorm(50)
y <- runif(30)
# Do x and y come from the same distribution?
ks.test(x, y)
# Does x come from a shifted gamma distribution with shape 3 and rate 2?
ks.test(x+2, "pgamma", 3, 2) # two-sided, exact
ks.test(x+2, "pgamma", 3, 2, exact = FALSE)
ks.test(x+2, "pgamma", 3, 2, alternative = "gr")
# test if x is stochastically larger than x2
x2 <- rnorm(50, 1)
plot(ecdf(x), xlim = range(c(x, x2)))
plot(ecdf(x2), add = TRUE, lty = "dashed")
t.test(x, x2, alternative = "g")
wilcox.test(x, x2, alternative = "g")
ks.test(x, x2, alternative = "l")