plfit: Power-law fit (PLFIT) Algorithm

Description

This function implements the PLFIT algorithm described by Clauset et al. to determine the optimal number of top-order statistics, $\hat k$. It chooses $\hat k$ by minimizing the Kolmogorov-Smirnov (KS) distance between the empirical cumulative distribution function and the fitted power law.

Usage

plfit(data, kmax = -1, kmin = 2, na.rm = FALSE)

Value

A named list containing the results of the PLFIT algorithm:

k_hat: The optimal number of top-order statistics $\hat{k}$.
alpha_hat: The estimated power-law exponent $\hat{\alpha}$ corresponding to $\hat{k}$.
xmin_hat: The minimum value $x_{\min} = X_{(\hat{k})}$ above which the power law is fitted.
ks_distance: The minimum Kolmogorov-Smirnov distance $D_{n,k}$ found.

Arguments

data: A numeric vector of i.i.d. observations.
kmax: Maximum number of top-order statistics to consider. If kmax = -1, kmax is set to n - 1, where n is the length of data.
kmin: Minimum number of top-order statistics to start with.
na.rm: Logical. If TRUE, missing values (NA) are removed before analysis. Defaults to FALSE.

Details

$$D_{n,k} := \sup_{y \ge 1} |\frac{1}{k-1} \sum_{i=1}^{k-1} I (\frac{X_{(i)}}{X_{(k)}} > y) - y^{-\hat{\alpha}_{n,k}^H}|$$

The equation above, as given by Nair et al., is stated in terms of the empirical survival function. This function implements it with the empirical CDF instead, which is mathematically equivalent because the CDF and the survival function are complements of each other.

$$D_{n,k} := \sup_{y \ge 1} \left| \underbrace{ \frac{1}{k-1} \sum_{i=1}^{k-1} I\left(\frac{X_{(i)}}{X_{(k)}} \le y\right) }_{\text{Empirical CDF}} - \underbrace{ \left(1 - y^{-\hat{\alpha}_{n,k}^H}\right) }_{\text{Theoretical CDF}} \right|$$
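
The equivalence of the two forms can be checked in one line. This is only a sketch: $\hat{F}_k(y)$ is shorthand introduced here (not notation from the references) for the empirical CDF of the ratios $X_{(i)}/X_{(k)}$, $i = 1, \dots, k-1$, and $\hat{\alpha}$ stands for the fitted exponent. Since $\frac{1}{k-1} \sum_{i=1}^{k-1} I(X_{(i)}/X_{(k)} > y) = 1 - \hat{F}_k(y)$, for every $y \ge 1$:

$$\left| \left(1 - \hat{F}_k(y)\right) - y^{-\hat{\alpha}} \right| = \left| \left(1 - y^{-\hat{\alpha}}\right) - \hat{F}_k(y) \right| = \left| \hat{F}_k(y) - \left(1 - y^{-\hat{\alpha}}\right) \right|$$

The absolute differences agree pointwise, so their suprema over $y \ge 1$ agree as well.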

$$\hat k = \operatorname*{arg\,min}_{k_{\min} \le k \le k_{\max}} D_{n,k}$$
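
The search over $k$ can be sketched in a few lines of R. This is a minimal illustration under stated assumptions, not the package's actual implementation: the helper names `hill_alpha`, `ks_distance_k`, and `plfit_sketch` are hypothetical, the exponent is assumed to be a Hill-type estimate computed from the $k-1$ observations above $X_{(k)}$, and the supremum is evaluated at the jump points of the empirical CDF.

```r
# Hypothetical helpers for illustration only; not the package's source code.

# Assumed Hill-type estimate of alpha from the k - 1 observations above X_(k).
hill_alpha <- function(x_desc, k) {
  ratios <- x_desc[seq_len(k - 1)] / x_desc[k]
  (k - 1) / sum(log(ratios))
}

# KS distance D_{n,k} between the empirical CDF of the ratios and 1 - y^(-alpha).
ks_distance_k <- function(x_desc, k) {
  m <- k - 1
  alpha <- hill_alpha(x_desc, k)
  ratios <- sort(x_desc[seq_len(m)] / x_desc[k])  # ascending, all >= 1
  cdf_theo <- 1 - ratios^(-alpha)
  # The supremum is attained at a jump of the empirical CDF, so compare the
  # theoretical CDF with the step height just after and just before each jump.
  max(pmax(seq_len(m) / m - cdf_theo, cdf_theo - (seq_len(m) - 1) / m))
}

# Scan k = kmin, ..., kmax and keep the k with the smallest KS distance.
plfit_sketch <- function(data, kmax = -1, kmin = 2, na.rm = FALSE) {
  if (na.rm) data <- data[!is.na(data)]
  if (kmax == -1) kmax <- length(data) - 1
  x_desc <- sort(data, decreasing = TRUE)  # X_(1) >= X_(2) >= ...
  ks <- kmin:kmax
  d <- vapply(ks, function(k) ks_distance_k(x_desc, k), numeric(1))
  best <- which.min(d)
  k_hat <- ks[best]
  list(k_hat = k_hat,
       alpha_hat = hill_alpha(x_desc, k_hat),
       xmin_hat = x_desc[k_hat],
       ks_distance = d[best])
}

set.seed(1)
x <- runif(800)^(-1 / 2)  # Pareto sample with xmin = 1, alpha = 2
fit <- plfit_sketch(x)
```

The returned list mirrors the names in the Value section, so `fit$k_hat` and `fit$alpha_hat` can be read off directly. Evaluating every $k$ from scratch costs roughly quadratic time in $n$, which is fine for a sketch but may differ from how the package computes the same quantities.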

References

Clauset, A., Shalizi, C. R., & Newman, M. E. J. (2009). Power-law distributions in empirical data. SIAM Review, 51(4), 661-703. doi:10.1137/070710111

Nair, J., Wierman, A., & Zwart, B. (2022). The Fundamentals of Heavy Tails: Properties, Emergence, and Estimation. Cambridge University Press. (pp. 227-229) doi:10.1017/9781009053730

Examples

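# Draw 800 Pareto(xmin = 1, alpha = 2) samples by inverse-transform sampling
xmin <- 1
alpha <- 2
r <- runif(800, 0, 1)
x <- xmin * r^(-1 / alpha)
# Fit over the default search range: kmin = 2, kmax = n - 1
plfit_values <- plfit(data = x, kmax = -1, kmin = 2)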