# Hypergeometric

##### The Hypergeometric Distribution

Density, distribution function, quantile function and random generation for the hypergeometric distribution.

- Keywords
- distribution

##### Usage

```
dhyper(x, m, n, k, log = FALSE)
phyper(q, m, n, k, lower.tail = TRUE, log.p = FALSE)
qhyper(p, m, n, k, lower.tail = TRUE, log.p = FALSE)
rhyper(nn, m, n, k)
```

##### Arguments

- x, q
- vector of quantiles representing the number of white balls drawn without replacement from an urn which contains both black and white balls.
- m
- the number of white balls in the urn.
- n
- the number of black balls in the urn.
- k
- the number of balls drawn from the urn.
- p
- probability; it must be between 0 and 1.
- nn
- number of observations. If `length(nn) > 1`, the length is taken to be the number required.
- log, log.p
- logical; if TRUE, probabilities p are given as log(p).
- lower.tail
- logical; if TRUE (default), probabilities are $P[X \le x]$, otherwise, $P[X > x]$.
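
As a rough illustration of how these arguments interact, the sketch below uses an urn whose counts (10 white, 7 black, 8 drawn) are assumptions chosen only for this example. It checks that the two tails selected by `lower.tail` sum to one and that `log = TRUE` returns the logarithm of the density.

```r
# Hypothetical urn for illustration: 10 white balls, 7 black balls, 8 drawn.
m <- 10; n <- 7; k <- 8

# P[X <= 4] (default) and P[X > 4] (lower.tail = FALSE) are complementary.
lower <- phyper(4, m, n, k)
upper <- phyper(4, m, n, k, lower.tail = FALSE)
tail_sum <- lower + upper

# log = TRUE returns log(p) instead of p.
dens     <- dhyper(4, m, n, k)
log_dens <- dhyper(4, m, n, k, log = TRUE)
```

Working on the log scale is mainly useful when the probabilities are too small to represent accurately as ordinary doubles.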

##### Details

The hypergeometric distribution is used for sampling *without* replacement. The density of this distribution with parameters `m`, `n` and `k` (named $Np$, $N-Np$, and $n$, respectively, in the reference below) is given by
$$p(x) = \left. {m \choose x}{n \choose k-x} \right/ {m+n \choose k}$$
for $x = 0, \ldots, k$.
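
The density formula can be checked directly with `choose`. This is a minimal sketch, and the parameter values are assumptions made for illustration only.

```r
# Illustrative parameters (assumed): 10 white, 7 black, 8 drawn.
m <- 10; n <- 7; k <- 8
x <- 0:k

# Density written out from the formula; choose(n, j) is 0 when j > n.
manual  <- choose(m, x) * choose(n, k - x) / choose(m + n, k)
builtin <- dhyper(x, m, n, k)

# Largest absolute discrepancy between the two computations.
max_abs_diff <- max(abs(manual - builtin))
```

For large arguments the explicit binomial coefficients can overflow, so this form is suitable only for small checks like this one.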

Note that $p(x)$ is non-zero only for $\max(0, k-n) \le x \le \min(k, m)$.
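
A short sketch of the support bounds, again with assumed parameter values: with 7 black balls and 8 draws, at least one white ball must be drawn, so $x = 0$ has probability zero.

```r
# Illustrative parameters (assumed): 10 white, 7 black, 8 drawn.
m <- 10; n <- 7; k <- 8
x <- 0:k

# Values of x with strictly positive probability.
support <- x[dhyper(x, m, n, k) > 0]

# Bounds given by the formula max(0, k - n) <= x <= min(k, m).
lo <- max(0, k - n)
hi <- min(k, m)
```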

With $p := m/(m+n)$ (hence $Np = N \times p$ in the reference's notation), the first two moments are mean $$E[X] = \mu = k p$$ and variance $$\mbox{Var}(X) = k p (1 - p) \frac{m+n-k}{m+n-1},$$ which shows the closeness to the Binomial$(k,p)$ (where the hypergeometric has smaller variance unless $k = 1$).
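
The moment formulas can be compared against sums over the density. The parameter values in this sketch are assumed for illustration.

```r
# Illustrative parameters (assumed): 10 white, 7 black, 8 drawn.
m <- 10; n <- 7; k <- 8
x  <- 0:k
px <- dhyper(x, m, n, k)
p  <- m / (m + n)

# Closed-form mean and variance.
mean_formula <- k * p
var_formula  <- k * p * (1 - p) * (m + n - k) / (m + n - 1)

# The same moments computed by summing over the density.
mean_direct <- sum(x * px)
var_direct  <- sum((x - mean_direct)^2 * px)

# Variance of the corresponding Binomial(k, p), for comparison.
var_binom <- k * p * (1 - p)
```

Since $k > 1$ here, the factor $(m+n-k)/(m+n-1)$ is below one and the hypergeometric variance is smaller than the binomial variance.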

The quantile is defined as the smallest value $x$ such that $F(x) \ge p$, where $F$ is the distribution function.
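
The quantile definition can be sketched as a search over the distribution function. This illustrates the definition only, not how `qhyper` is implemented internally; the parameter values and the probability `p` are assumptions.

```r
# Illustrative parameters (assumed): 10 white, 7 black, 8 drawn.
m <- 10; n <- 7; k <- 8
p <- 0.3

# Distribution function evaluated on 0, ..., k.
x  <- 0:k
Fx <- phyper(x, m, n, k)

# Smallest x with F(x) >= p, following the definition.
q_manual  <- x[which(Fx >= p)[1]]
q_builtin <- qhyper(p, m, n, k)
```

When `p` lies very close to a jump of $F$, floating-point tolerance in either computation can matter, so the comparison is best made away from such values.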

If one of $m, n, k$ exceeds `.Machine$integer.max`, `rhyper` currently uses the equivalent of `qhyper(runif(nn), m, n, k)`, although a binomial approximation may be considerably more efficient.

##### Value

`dhyper` gives the density, `phyper` gives the distribution function, `qhyper` gives the quantile function, and `rhyper` generates random deviates.

Invalid arguments will result in return value `NaN`, with a warning.

The length of the result is determined by `nn` for `rhyper`, and is the maximum of the lengths of the numerical arguments for the other functions.

The numerical arguments other than `nn` are recycled to the length of the result. Only the first elements of the logical arguments are used.
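
The following sketch illustrates result lengths, recycling and the `NaN` return value. All argument values are assumptions chosen for the example, and the warning is suppressed only to keep the output quiet.

```r
# Recycling: the length-2 m is recycled against the length-4 x,
# so the result has length 4.
d <- dhyper(0:3, m = c(10, 12), n = 7, k = 8)

# An invalid argument (negative m) gives NaN, normally with a warning.
bad <- suppressWarnings(dhyper(1, m = -1, n = 7, k = 8))

# For rhyper the first argument sets the number of deviates generated.
set.seed(1)
r <- rhyper(5, m = 10, n = 7, k = 8)
```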

##### Source

`dhyper` computes via binomial probabilities, using code contributed by Catherine Loader (see `dbinom`).

`phyper` is based on calculating `dhyper` and `phyper(...)/dhyper(...)` (as a summation), based on ideas of Ian Smith and Morten Welinder.

`qhyper` is based on inversion.

`rhyper` is based on a corrected version of

Kachitvichyanukul, V. and Schmeiser, B. (1985).
Computer generation of hypergeometric random variates.
*Journal of Statistical Computation and Simulation*,
**22**, 127–145.

##### References

Johnson, N. L., Kotz, S., and Kemp, A. W. (1992)
*Univariate Discrete Distributions*,
Second Edition. New York: Wiley.

##### See Also

Distributions for other standard distributions.

##### Examples

`library(stats)`

```
m <- 10; n <- 7; k <- 8
x <- 0:(k+1)
rbind(phyper(x, m, n, k), dhyper(x, m, n, k))
all(phyper(x, m, n, k) == cumsum(dhyper(x, m, n, k))) # FALSE
## but error is very small:
signif(phyper(x, m, n, k) - cumsum(dhyper(x, m, n, k)), digits = 3)
```

*Documentation reproduced from package stats, version 3.3, License: Part of R 3.3*