Perform a one- or two-sample Kolmogorov-Smirnov test.

```
ks.test(x, y, …,
alternative = c("two.sided", "less", "greater"),
exact = NULL)
```

x

a numeric vector of data values.

y

either a numeric vector of data values, or a character string
naming a cumulative distribution function or an actual cumulative
distribution function such as `pnorm`

. Only continuous CDFs
are valid.

…

parameters of the distribution specified (as a character
string) by `y`

.

alternative

indicates the alternative hypothesis and must be
one of `"two.sided"`

(default), `"less"`

, or
`"greater"`

. You can specify just the initial letter of the
value, but the argument name must be give in full.
See ‘Details’ for the meanings of the possible values.

exact

`NULL`

or a logical indicating whether an exact
p-value should be computed. See ‘Details’ for the meaning of
`NULL`

. Not available in the two-sample case for a one-sided
test or if ties are present.

A list with class `"htest"`

containing the following components:

the value of the test statistic.

the p-value of the test.

a character string describing the alternative hypothesis.

a character string indicating what type of test was performed.

a character string giving the name(s) of the data.

If `y`

is numeric, a two-sample test of the null hypothesis
that `x`

and `y`

were drawn from the same *continuous*
distribution is performed.

Alternatively, `y`

can be a character string naming a continuous
(cumulative) distribution function, or such a function. In this case,
a one-sample test is carried out of the null that the distribution
function which generated `x`

is distribution `y`

with
parameters specified by `…`

.

The presence of ties always generates a warning, since continuous distributions do not generate them. If the ties arose from rounding the tests may be approximately valid, but even modest amounts of rounding can have a significant effect on the calculated statistic.

Missing values are silently omitted from `x`

and (in the
two-sample case) `y`

.

The possible values `"two.sided"`

, `"less"`

and
`"greater"`

of `alternative`

specify the null hypothesis
that the true distribution function of `x`

is equal to, not less
than or not greater than the hypothesized distribution function
(one-sample case) or the distribution function of `y`

(two-sample
case), respectively. This is a comparison of cumulative distribution
functions, and the test statistic is the maximum difference in value,
with the statistic in the `"greater"`

alternative being
\(D^+ = \max_u [ F_x(u) - F_y(u) ]\).
Thus in the two-sample case `alternative = "greater"`

includes
distributions for which `x`

is stochastically *smaller* than
`y`

(the CDF of `x`

lies above and hence to the left of that
for `y`

), in contrast to `t.test`

or
`wilcox.test`

.

Exact p-values are not available for the two-sample case if one-sided
or in the presence of ties. If `exact = NULL`

(the default), an
exact p-value is computed if the sample size is less than 100 in the
one-sample case *and there are no ties*, and if the product of
the sample sizes is less than 10000 in the two-sample case.
Otherwise, asymptotic distributions are used whose approximations may
be inaccurate in small samples. In the one-sample two-sided case,
exact p-values are obtained as described in Marsaglia, Tsang & Wang
(2003) (but not using the optional approximation in the right tail, so
this can be slow for small p-values). The formula of Birnbaum &
Tingey (1951) is used for the one-sample one-sided case.

If a single-sample test is used, the parameters specified in
`…`

must be pre-specified and not estimated from the data.
There is some more refined distribution theory for the KS test with
estimated parameters (see Durbin, 1973), but that is not implemented
in `ks.test`

.

Z. W. Birnbaum and Fred H. Tingey (1951).
One-sided confidence contours for probability distribution functions.
*The Annals of Mathematical Statistics*, **22**/4, 592--596.
10.1214/aoms/1177729550.

William J. Conover (1971).
*Practical Nonparametric Statistics*.
New York: John Wiley & Sons.
Pages 295--301 (one-sample Kolmogorov test),
309--314 (two-sample Smirnov test).

Durbin, J. (1973).
*Distribution theory for tests based on the sample distribution
function*.
SIAM.

George Marsaglia, Wai Wan Tsang and Jingbo Wang (2003).
Evaluating Kolmogorov's distribution.
*Journal of Statistical Software*, **8**/18.
10.18637/jss.v008.i18.

`shapiro.test`

which performs the Shapiro-Wilk test for
normality.

```
# NOT RUN {
require(graphics)
x <- rnorm(50)
y <- runif(30)
# Do x and y come from the same distribution?
ks.test(x, y)
# Does x come from a shifted gamma distribution with shape 3 and rate 2?
ks.test(x+2, "pgamma", 3, 2) # two-sided, exact
ks.test(x+2, "pgamma", 3, 2, exact = FALSE)
ks.test(x+2, "pgamma", 3, 2, alternative = "gr")
# test if x is stochastically larger than x2
x2 <- rnorm(50, -1)
plot(ecdf(x), xlim = range(c(x, x2)))
plot(ecdf(x2), add = TRUE, lty = "dashed")
t.test(x, x2, alternative = "g")
wilcox.test(x, x2, alternative = "g")
ks.test(x, x2, alternative = "l")
# }
```

Run the code above in your browser using DataCamp Workspace