# Hypergeometric

##### The Hypergeometric Distribution

Density, distribution function, quantile function and random generation for the hypergeometric distribution.

- Keywords
- distribution

##### Usage

```
dhyper(x, m, n, k, log = FALSE)
phyper(q, m, n, k, lower.tail = TRUE, log.p = FALSE)
qhyper(p, m, n, k, lower.tail = TRUE, log.p = FALSE)
rhyper(nn, m, n, k)
```

##### Arguments

- x, q
vector of quantiles representing the number of white balls drawn without replacement from an urn which contains both black and white balls.

- m
the number of white balls in the urn.

- n
the number of black balls in the urn.

- k
the number of balls drawn from the urn.

- p
probability; it must be between 0 and 1.

- nn
number of observations. If `length(nn) > 1`, the length is taken to be the number required.

- log, log.p
logical; if TRUE, probabilities p are given as log(p).

- lower.tail
logical; if TRUE (default), probabilities are \(P[X \le x]\), otherwise, \(P[X > x]\).
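
The effect of `lower.tail`, `log` and `log.p` can be checked with a short sketch. The urn sizes below are illustrative values chosen for this example, not values prescribed by the package.

```r
m <- 10; n <- 7; k <- 8   # white balls, black balls, balls drawn (illustrative)

# Upper tail P[X > 4] requested directly versus obtained by subtraction
lower <- phyper(4, m, n, k)
upper <- phyper(4, m, n, k, lower.tail = FALSE)
tail_ok <- isTRUE(all.equal(upper, 1 - lower))

# log = TRUE and log.p = TRUE return probabilities on the log scale
logd_ok <- isTRUE(all.equal(dhyper(4, m, n, k, log = TRUE),
                            log(dhyper(4, m, n, k))))
logp_ok <- isTRUE(all.equal(phyper(4, m, n, k, log.p = TRUE), log(lower)))

# With log.p = TRUE, qhyper expects its probability argument on the log scale
q_ok <- qhyper(log(0.5), m, n, k, log.p = TRUE) == qhyper(0.5, m, n, k)

c(tail_ok, logd_ok, logp_ok, q_ok)
```

Requesting the upper tail directly is usually preferable to computing `1 - phyper(...)` when the tail probability is very small, since the subtraction loses precision.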

##### Details

The hypergeometric distribution is used for sampling *without* replacement. The density of this distribution with parameters `m`, `n` and `k` (named \(Np\), \(N-Np\), and \(n\), respectively, in the reference below) is given by
$$
p(x) = \left. {m \choose x}{n \choose k-x} \right/ {m+n \choose k}
$$
for \(x = 0, \ldots, k\).

Note that \(p(x)\) is non-zero only for \(\max(0, k-n) \le x \le \min(k, m)\).
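
As a minimal sketch, the density formula can be evaluated directly with `choose()` and compared against `dhyper`. The parameter values are illustrative only.

```r
m <- 10; n <- 7; k <- 8   # white balls, black balls, balls drawn (illustrative)
x <- 0:k

# Density written out from the formula above
manual <- choose(m, x) * choose(n, k - x) / choose(m + n, k)
formula_ok <- isTRUE(all.equal(manual, dhyper(x, m, n, k)))

# Values of x with non-zero probability, to compare with the stated bounds
support <- x[dhyper(x, m, n, k) > 0]
bounds  <- c(max(0, k - n), min(k, m))

formula_ok
range(support)
bounds
```

With these values, at least one white ball must be drawn because there are only 7 black balls for 8 draws, so the support starts at `k - n` rather than at 0.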

With \(p := m/(m+n)\) (hence \(Np = N \times p\) in the reference's notation), the first two moments are mean $$E[X] = \mu = k p$$ and variance $$\mbox{Var}(X) = k p (1 - p) \frac{m+n-k}{m+n-1},$$ which shows the closeness to the Binomial\((k,p)\) (where the hypergeometric has smaller variance unless \(k = 1\)).
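
The moment formulas can be verified numerically by summing over the support. This is a sketch with illustrative parameter values; `mu` and `v` are names introduced here for the example.

```r
m <- 10; n <- 7; k <- 8   # illustrative urn
x  <- 0:k
px <- dhyper(x, m, n, k)
p  <- m / (m + n)

# Mean and variance computed from the density
mu <- sum(x * px)
v  <- sum((x - mu)^2 * px)

mean_ok <- isTRUE(all.equal(mu, k * p))
var_ok  <- isTRUE(all.equal(v, k * p * (1 - p) * (m + n - k) / (m + n - 1)))

# The Binomial(k, p) variance is larger because k > 1 here
smaller_than_binomial <- v < k * p * (1 - p)

c(mean_ok, var_ok, smaller_than_binomial)
```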

The quantile is defined as the smallest value \(x\) such that \(F(x) \ge p\), where \(F\) is the distribution function.
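
The quantile definition can be illustrated by searching the distribution function by hand. In this sketch, `manual_q` is a helper name introduced for the example, and the probabilities are chosen away from the jump points of \(F\) so that rounding does not affect the comparison.

```r
m <- 10; n <- 7; k <- 8   # illustrative urn
x  <- 0:k
Fx <- phyper(x, m, n, k)

# Smallest x with F(x) >= p, found by direct search
manual_q <- function(p) x[min(which(Fx >= p))]

probs <- c(0.1, 0.5, 0.9)
by_search <- sapply(probs, manual_q)
by_qhyper <- qhyper(probs, m, n, k)

rbind(by_search, by_qhyper)
```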

If one of \(m, n, k\) exceeds `.Machine$integer.max`, `rhyper` currently uses the equivalent of `qhyper(runif(nn), m, n, k)`, although a binomial approximation may be considerably more efficient in that case.

##### Value

`dhyper` gives the density, `phyper` gives the distribution function, `qhyper` gives the quantile function, and `rhyper` generates random deviates.

Invalid arguments will result in return value `NaN`, with a warning.
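
A short sketch of the invalid-argument behaviour, using a negative number of white balls and a draw count larger than the urn as example inputs:

```r
# Negative m is not a valid urn size; the warning is suppressed to inspect the value
neg_m <- suppressWarnings(dhyper(1, m = -1, n = 7, k = 8))

# Drawing more balls than the urn holds (k > m + n) is also invalid
too_many <- suppressWarnings(dhyper(1, m = 3, n = 2, k = 10))

# Confirm that a warning is actually signalled
warned <- tryCatch({ dhyper(1, m = -1, n = 7, k = 8); FALSE },
                   warning = function(w) TRUE)

c(is.nan(neg_m), is.nan(too_many), warned)
```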

The length of the result is determined by `nn` for `rhyper`, and is the maximum of the lengths of the numerical arguments for the other functions.

The numerical arguments other than `nn` are recycled to the length of the result. Only the first elements of the logical arguments are used.
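
The length and recycling rules can be seen in a small sketch. The first argument of `rhyper` is the one named `nn` in the Usage section; all values are illustrative.

```r
set.seed(1)  # only so that the random draws are reproducible

# Result length is the maximum length of the numerical arguments
len_d <- length(dhyper(0:3, m = 10, n = 7, k = 8))
len_p <- length(phyper(2, m = c(10, 12, 14), n = 7, k = 8))

# Shorter arguments are recycled: m alternates between 10 and 12
recycled <- dhyper(0:3, m = c(10, 12), n = 7, k = 8)
by_hand  <- c(dhyper(0, 10, 7, 8), dhyper(1, 12, 7, 8),
              dhyper(2, 10, 7, 8), dhyper(3, 12, 7, 8))
recycle_ok <- isTRUE(all.equal(recycled, by_hand))

# For rhyper the first argument sets the number of draws;
# a vector there contributes only its length
len_r1 <- length(rhyper(5, 10, 7, 8))
len_r2 <- length(rhyper(c(2, 4, 6), 10, 7, 8))

c(len_d, len_p, len_r1, len_r2)
recycle_ok
```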

##### References

Johnson, N. L., Kotz, S., and Kemp, A. W. (1992)
*Univariate Discrete Distributions*,
Second Edition. New York: Wiley.

##### See Also

Distributions for other standard distributions.

##### Examples

`library(stats)`

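```
m <- 10; n <- 7; k <- 8   # white balls, black balls, balls drawn
x <- 0:(k+1)              # includes k+1, which lies outside the support
# Row 1: distribution function; row 2: density
rbind(phyper(x, m, n, k), dhyper(x, m, n, k))
# Exact equality fails because of floating-point rounding
all(phyper(x, m, n, k) == cumsum(dhyper(x, m, n, k))) # FALSE
## but error is very small:
signif(phyper(x, m, n, k) - cumsum(dhyper(x, m, n, k)), digits = 3)
```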

*Documentation reproduced from package stats, version 3.5.0, License: Part of R 3.5.0*