qdap (version 2.4.6)

diversity: Diversity Statistics

Description

Apply diversity/richness indices to a transcript.

Usage

diversity(text.var, grouping.var = NULL)

Value

Returns a data frame of diversity-related indices: Shannon, Simpson, collision, Berger Parker and Brillouin.

Arguments

text.var

The text variable.

grouping.var

The grouping variables. Default NULL generates one word list for all text. Also takes a single grouping variable or a list of 1 or more grouping variables.

Details

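These are the formulas used to calculate the indices. Here \(R\) is the number of distinct words, \(n_i\) is the count of the \(i\)-th word, \(N\) is the total word count, \(p_i = n_i/N\) is the relative frequency of the \(i\)-th word, and \(N_{max}\) is the count of the most frequent word: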

Shannon index: $$H_1(X)=-\sum\limits_{i=1}^R{p_i}\;\log\;p_i$$

Shannon, C. E. (1948). A mathematical theory of communication. Bell System

Simpson index: $$D=\frac{\sum_{i=1}^R{n_i(n_i -1)}}{N(N-1)}$$

Simpson, E. H. (1949). Measurement of diversity. Nature 163, p. 688

Collision entropy: $$H_2(X)=-\log\sum_{i=1}^R{p_i}^2$$

Renyi, A. (1961). On measures of entropy and information. Proceedings of the 4th Berkeley Symposium on Mathematical Statistics and Probability, 1960. pp. 547-561.

Berger Parker index: $$D_{BP}=\frac{N_{max}}{N}$$

Berger, W. H., & Parker, F. L. (1970). Diversity of planktonic Foraminifera in deep sea sediments. Science 168, pp. 1345-1347.

Brillouin index: $$H_B=\frac{\ln(N!)-\sum_{i=1}^R{\ln(n_i!)}}{N}$$

Magurran, A. E. (2004). Measuring biological diversity. Blackwell.
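
As an illustration of how the formulas above fit together, the following base R sketch computes each index from the word counts of a single text string. The function name `diversity_sketch` and its tokenizer are hypothetical and not part of qdap. It assumes natural logarithms and the standard textbook forms of the indices, so qdap's own word splitting and output conventions may give different numbers.

```r
# Hypothetical helper (NOT part of qdap): a minimal sketch of the indices.
diversity_sketch <- function(text) {
  # Crude tokenizer (assumption): lower-case, split on non-letter characters
  words <- unlist(strsplit(tolower(text), "[^a-z']+"))
  words <- words[nzchar(words)]

  n <- as.vector(table(words))  # n_i: count of each distinct word
  N <- sum(n)                   # N: total number of words
  p <- n / N                    # p_i: relative frequency of each word

  c(
    wc            = N,
    simpson       = sum(n * (n - 1)) / (N * (N - 1)),
    shannon       = -sum(p * log(p)),          # natural log assumed
    collision     = -log(sum(p^2)),
    berger_parker = max(n) / N,
    # lfactorial(x) is log(x!), which avoids overflow for large counts
    brillouin     = (lfactorial(N) - sum(lfactorial(n))) / N
  )
}

res <- diversity_sketch("the cat sat on the mat the end")
print(round(res, 4))
```

Using `lfactorial` rather than `log(factorial(x))` keeps the Brillouin index finite for long transcripts, where `factorial` would overflow. The Simpson term is undefined for a text of a single word, since the denominator is zero.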

References

https://arxiv.org/abs/physics/0512106

Examples

if (FALSE) {
div.mod <- with(mraja1spl, diversity(dialogue, list(sex, died, fam.aff)))
colsplit2df(div.mod)
plot(div.mod, high = "red", low = "yellow")
plot(div.mod, high = "red", low = "yellow", values = TRUE)
}
