tolerance (version 0.4.0)

Zipf: The Zipf Distribution

Description

Density (mass), distribution function, quantile function, and random generation for the Zipf distribution with N categories and shape parameter s.

Usage

dzipf(x, s, N, log = FALSE)
pzipf(q, s, N, lower.tail = TRUE, log.p = FALSE)
qzipf(p, s, N, lower.tail = TRUE, log.p = FALSE)
rzipf(n, s, N)

Arguments

x, q
Vector of quantiles.
p
Vector of probabilities.
n
The number of observations. If length(n) > 1, then the length is taken to be the number required.
s
The shape parameter, which must be greater than 0.
N
The number of categories, which must be integer-valued.
log, log.p
Logical vectors. If TRUE, then the probabilities are given as log(p).
lower.tail
Logical vector. If TRUE, then probabilities are $P[X\le x]$, else $P[X>x]$.

Value

  • dzipf gives the density (mass), pzipf gives the distribution function, qzipf gives the quantile function, and rzipf generates random deviates.
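To make the relationship between the mass, distribution, and quantile functions concrete, here is a minimal base-R sketch. The helpers `zipf_cdf` and `zipf_quantile` are hypothetical names introduced for illustration; they are not the package's implementation. The quantile helper assumes the convention used by base R's discrete distributions (the smallest value whose cumulative probability is at least p), and qzipf may differ in edge cases.

```r
# Hypothetical helpers (not part of the tolerance package), base R only.

# Distribution function: P[X <= q], or P[X > q] when lower.tail = FALSE.
zipf_cdf <- function(q, s, N, lower.tail = TRUE, log.p = FALSE) {
  mass <- seq_len(N)^(-s)
  cum <- cumsum(mass) / sum(mass)     # P[X <= k] for k = 1, ..., N
  cum[N] <- 1                         # guard against rounding error
  k <- pmin(pmax(floor(q), 0), N)     # clamp q to the range 0, ..., N
  p <- c(0, cum)[k + 1]               # P[X <= 0] is 0
  if (!lower.tail) p <- 1 - p
  if (log.p) log(p) else p
}

# Quantile function: smallest k with P[X <= k] >= p (assumed convention).
zipf_quantile <- function(p, s, N) {
  mass <- seq_len(N)^(-s)
  cum <- cumsum(mass) / sum(mass)
  cum[N] <- 1                         # guard against rounding error
  vapply(p, function(prob) which(cum >= prob)[1], integer(1))
}

zipf_cdf(3, s = 2, N = 100)                       # P[X <= 3]
zipf_cdf(3, s = 2, N = 100, lower.tail = FALSE)   # P[X > 3]
zipf_quantile(c(0.5, 0.8), s = 2, N = 100)
```

With lower.tail = FALSE the helper returns the complement of the lower-tail probability, and with log.p = TRUE it returns the logarithm of whichever probability was requested, mirroring the argument descriptions above.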

Details

The Zipf distribution has mass $$p(x) = \frac{x^{-s}}{\sum_{i=1}^{N}i^{-s}},$$ where $x=1,\ldots,N$, $s>0$ is the shape parameter, and $N$ is the number of distinct categories. Note that the Zipf distribution is a special case of the Zipf-Mandelbrot distribution in which the second shape parameter is $b=0$.
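The mass function above can be evaluated directly in base R. The following is a minimal sketch, with the shape parameter written as s to match the Usage section; `zipf_pmf` is a hypothetical helper for illustration and not a function in the package.

```r
# Hypothetical helper (not part of the tolerance package): evaluates the
# Zipf mass function directly from its definition.
zipf_pmf <- function(x, s, N) {
  stopifnot(s > 0, N >= 1, N == round(N))
  support <- seq_len(N)
  norm_const <- sum(support^(-s))     # denominator: sum of i^(-s), i = 1..N
  # The mass is zero for any x outside the support {1, ..., N}
  ifelse(x %in% support, x^(-s) / norm_const, 0)
}

probs <- zipf_pmf(1:100, s = 2, N = 100)
sum(probs)             # masses over the full support sum to 1
probs[1] / probs[2]    # the ratio p(1) / p(2) is 2^s
```

Because the normalizing constant is a finite sum, the mass function is well defined for every s > 0, including values of s at or below 1.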

References

Zipf, G. K. (1949), Human Behavior and the Principle of Least Effort, Hafner.

Zörnig, P. and Altmann, G. (1995), Unified Representation of Zipf Distributions, Computational Statistics and Data Analysis, 19, 461-473.

See Also

runif and .Random.seed about random number generation.

Examples

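## Randomly generated data from the Zipf distribution.

library(tolerance)

set.seed(100)  # fix the seed so the sample is reproducible
x <- rzipf(n = 500, s = 2, N = 100)

## Histogram on the density scale, so the mass function can be overlaid.
hist(x, main = "Randomly Generated Data", freq = FALSE)

## Overlay the Zipf mass function evaluated at the sorted sample values.
x.1 <- sort(x)
y <- dzipf(x = x.1, s = 2, N = 100)
lines(x.1, y, col = 2, lwd = 2)

## Distribution function evaluated at the sorted sample values.
plot(x.1, pzipf(q = x.1, s = 2, N = 100), type = "l",
     xlab = "x", ylab = "Cumulative Probabilities")

## Quantiles for an upper-tail probability of 0.20 and a lower-tail
## probability of 0.80.
qzipf(p = 0.20, s = 2, N = 100, lower.tail = FALSE)
qzipf(p = 0.80, s = 2, N = 100)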
