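Computes the proportion of categories that fall in the tail of a frequency distribution, i.e. outside the most frequent ("dominant") categories that together account for the threshold share of counts.
Useful for quantifying long-tail structure in the distribution of a nominal variable.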
Usage
tail_index(df, var, threshold = 0.8)
Value
A single numeric value between 0 and 1: the proportion of categories that belong to the tail.
Arguments
df
A data.frame or tibble containing the variable.
var
Character. Name of the nominal variable in df.
threshold
Numeric. Cumulative proportion of counts that defines the "dominant" categories; categories outside that set are treated as the tail (default 0.8).
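
The page does not show how the value is computed, so the following is only a minimal sketch of one plausible reading of the description, not the package's implementation. It assumes that categories are ranked by count, that the dominant set is the smallest group of top categories whose cumulative share reaches threshold, and that the result is the share of remaining categories. The function name tail_index_sketch and the example data are hypothetical.

```r
# Hypothetical re-implementation for illustration only; the real
# tail_index() may handle ties, NAs, or boundaries differently.
tail_index_sketch <- function(df, var, threshold = 0.8) {
  # Count each category (table() drops NA by default), most frequent first.
  counts <- sort(table(df[[var]]), decreasing = TRUE)
  counts <- counts[counts > 0]  # ignore unused factor levels

  # Cumulative share of observations covered by the top categories.
  cum_share <- cumsum(as.numeric(counts)) / sum(counts)

  # Smallest number of top categories whose share reaches the threshold
  # (assumes 0 < threshold <= 1).
  n_dominant <- which(cum_share >= threshold)[1]

  # Every other category is counted as part of the tail.
  (length(counts) - n_dominant) / length(counts)
}

# Made-up data: 5 categories with counts 60, 25, 5, 5, 5.
df <- data.frame(
  colour = rep(c("red", "blue", "green", "grey", "pink"),
               times = c(60, 25, 5, 5, 5))
)

tail_index_sketch(df, "colour")                   # 2 dominant, 3 tail: 0.6
tail_index_sketch(df, "colour", threshold = 0.5)  # 1 dominant, 4 tail: 0.8
```

Under this reading, a value near 1 means that a few categories carry most of the observations while many rare categories make up the rest; a value near 0 means counts are spread fairly evenly.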