tolerance (version 0.4.0)

zipftol.int: Zipf-Mandelbrot Tolerance Intervals

Description

Provides 1-sided or 2-sided tolerance intervals for data distributed according to Zipf, Zipf-Mandelbrot, and zeta distributions.

Usage

zipftol.int(x, N = NULL, alpha = 0.05, P = 0.99, side = 1, s = 1,
            b = 1, dist = c("Zipf", "Zipf-Man", "Zeta"),
            exact = TRUE, ...)

Arguments

x
A vector or table of counts which is distributed according to a Zipf, Zipf-Mandelbrot, or zeta distribution.
N
The number of categories when dist = "Zipf" or dist = "Zipf-Man". This is not used when dist = "Zeta". If N = NULL, then N is estimated based on the number of categories observed in the da
alpha
The level chosen such that 1-alpha is the confidence level.
P
The proportion of the population to be covered by this tolerance interval.
side
Whether a 1-sided or 2-sided tolerance interval is required (determined by side = 1 or side = 2, respectively).
s
The initial value to estimate the shape parameter in the zm.ll function.
b
The initial value to estimate the second shape parameter in the zm.ll function when dist = "Zipf-Man".
dist
Options are dist = "Zipf", dist = "Zipf-Man", or dist = "Zeta" if the data is distributed according to the Zipf, Zipf-Mandelbrot, or zeta distribution, respectively.
exact
If exact = TRUE, then an ordinal ranking (based on the category labels) of the data is used. If exact = FALSE, then a Zipfian ranking of the data will be used (i.e., the data will be arranged by the raw counts in decreasing orde
...
Additional arguments passed to the zm.ll function, which is used for maximum likelihood estimation.
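The two rankings controlled by exact can be illustrated in base R. This is only a sketch of the orderings described above, not the package's internal code; the vector x is made-up illustration data.

```r
# Hypothetical sample of category labels (illustration data only).
x <- c(1, 1, 1, 2, 3, 3, 3, 3, 3, 4, 4)

# Ordinal ranking: counts stay in the order of the category labels.
ordinal <- table(x)

# Zipfian ranking: counts are rearranged in decreasing order, so rank 1
# is the most frequent category regardless of its label.
zipfian <- sort(ordinal, decreasing = TRUE)

as.vector(ordinal)  # counts in label order
as.vector(zipfian)  # the same counts, largest first
names(zipfian)      # category labels in rank order
```

When the most frequent category is not the one labelled 1, the two rankings differ, so the choice of exact changes which counts are treated as rank 1, rank 2, and so on.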

Value

zipftol.int returns a data frame with the following items:

  • alpha: The specified significance level.
  • P: The proportion of the population covered by this tolerance interval.
  • s.hat: MLE for the shape parameter s.
  • b.hat: MLE for the shape parameter b when dist = "Zipf-Man".
  • 1-sided.lower: The 1-sided lower tolerance bound. This is given only if side = 1.
  • 1-sided.upper: The 1-sided upper tolerance bound. This is given only if side = 1.
  • 2-sided.lower: The 2-sided lower tolerance bound. This is given only if side = 2.
  • 2-sided.upper: The 2-sided upper tolerance bound. This is given only if side = 2.
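Several of these item names begin with a digit and contain a hyphen, so they are not syntactic R names. The sketch below assumes the names appear in the returned data frame exactly as listed; it builds a placeholder data frame (NA values, not real output from zipftol.int) only to show how such columns can be accessed.

```r
# Placeholder data frame using the item names listed for side = 1.
# The NA values are stand-ins, not results computed by zipftol.int.
out <- data.frame(alpha = 0.05, P = 0.99, s.hat = NA_real_,
                  `1-sided.lower` = NA_real_, `1-sided.upper` = NA_real_,
                  check.names = FALSE)

# Non-syntactic names must be quoted: use [[ ]] with a string,
# or $ with backticks.
upper <- out[["1-sided.upper"]]
same  <- out$`1-sided.upper`
```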

Details

Zipf-Mandelbrot models are commonly used for phenomena in which the frequency of each category is approximately inversely proportional to its rank in the frequency table. Zipf-Mandelbrot distributions are heavily right-skewed, with a (relatively) large mass placed on the first category. For most practical applications, one will typically be interested in 1-sided upper bounds.
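The shape described here can be seen directly from the probability mass function. The sketch below assumes the standard Zipf-Mandelbrot form, with p(k) proportional to (k + b)^(-s) for ranks k = 1, ..., N (b = 0 gives the Zipf distribution). The helper zm_pmf is hypothetical and not part of the tolerance package, and the parameters are treated as known rather than estimated.

```r
# Hypothetical helper (not part of the tolerance package): Zipf-Mandelbrot
# probabilities p(k) proportional to (k + b)^(-s) for ranks k = 1, ..., N.
zm_pmf <- function(N, s, b = 0) {
  w <- (seq_len(N) + b)^(-s)
  w / sum(w)
}

p <- zm_pmf(N = 50, s = 2, b = 0)  # b = 0 reduces to the Zipf distribution

p[1]                               # mass placed on the first category
q <- which(cumsum(p) >= 0.99)[1]   # smallest rank covering at least 99% of the mass
q
```

With known parameters, q is simply the 0.99 quantile. A 1-sided upper tolerance bound plays the same role when the parameters must be estimated from data: it is an upper confidence bound on that quantile.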

References

Mandelbrot, B. B. (1965), Information Theory and Psycholinguistics. In B. B. Wolman and E. Nagel, editors, Scientific Psychology, Basic Books.

Zipf, G. K. (1949), Human Behavior and the Principle of Least Effort, Hafner.

Zörnig, P. and Altmann, G. (1995), Unified Representation of Zipf Distributions, Computational Statistics and Data Analysis, 19, 461-473.

See Also

Zeta, Zipf, ZipfMandelbrot, zm.ll

Examples

## 95%/99% 1-sided tolerance intervals for the Zipf,
## Zipf-Mandelbrot, and zeta distributions.

library(tolerance)

set.seed(100)

s <- 2
b <- 5
N <- 50

zipf.data <- rzipf(n = 500, s = s, N = N)
zipfman.data <- rzipfman(n = 500, s = s, b = b, N = N)
zeta.data <- rzeta(n = 200, s = s)

out.zipf <- zipftol.int(zipf.data, N = N, dist = "Zipf")
out.zipfman <- zipftol.int(zipfman.data, N = N,
                           dist = "Zipf-Man")
out.zeta <- zipftol.int(zeta.data, dist = "Zeta")

out.zipf
out.zipfman
out.zeta
