qdap (version 2.3.2)

diversity: Diversity Statistics

Description

Apply diversity/richness indices to a transcript.

Usage

diversity(text.var, grouping.var = NULL)

Arguments

text.var

The text variable.

grouping.var

The grouping variables. Default NULL generates one word list for all text. Also takes a single grouping variable or a list of 1 or more grouping variables.
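The effect of grouping.var can be pictured with a small base R sketch. This is an illustration only, not qdap's implementation: word_counts is a hypothetical helper, its tokenizer is a simplifying assumption, and the "&" separator for combined groups is chosen for illustration.

```r
# Hypothetical helper (not part of qdap): one word-frequency table per group.
word_counts <- function(text.var, grouping.var = NULL) {
  if (is.null(grouping.var)) {
    # No grouping variable: pool all text into a single group.
    grouping.var <- rep("all", length(text.var))
  } else if (is.list(grouping.var)) {
    # A list of grouping variables is combined into one factor.
    grouping.var <- interaction(grouping.var, sep = "&", drop = TRUE)
  }
  # Simplified tokenizer (an assumption): lower-case, keep letters and
  # apostrophes, split on runs of spaces, drop empty strings.
  tokenize <- function(x) {
    x <- gsub("[^a-z' ]", " ", tolower(x))
    words <- unlist(strsplit(x, " +"))
    words[nzchar(words)]
  }
  lapply(split(text.var, grouping.var), function(x) table(tokenize(x)))
}

txt <- c("the cat sat", "the dog ran", "a cat ran")
who <- c("A", "B", "A")

pooled <- word_counts(txt)       # one word list for all text
by_who <- word_counts(txt, who)  # one word list per speaker
```

Each table of counts is the kind of input from which the indices are computed, one row of results per group.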

Value

Returns a data frame of diversity indices: Shannon, Simpson, collision, Berger Parker and Brillouin.

Details

These are the formulas used to calculate the indices:

Shannon index: $$H_1(X)=-\sum\limits_{i=1}^R{p_i}\;\log\;p_i$$

Shannon, C. E. (1948). A mathematical theory of communication. Bell System

Simpson index: $$D=\frac{\sum_{i=1}^R{n_i(n_i -1)}}{N(N-1)}$$

Simpson, E. H. (1949). Measurement of diversity. Nature 163, p. 688

Collision entropy: $$H_2(X)=-\log\sum_{i=1}^R{p_i}^2$$

Rényi, A. (1961). On measures of entropy and information. Proceedings of the 4th Berkeley Symposium on Mathematical Statistics and Probability, 1960. pp. 547-561.

Berger Parker index: $$D_{BP}=\frac{N_{max}}{N}$$

Berger, W. H., & Parker, F. L. (1970). Diversity of planktonic Foraminifera in deep sea sediments. Science 168, pp. 1345-1347.

Brillouin index: $$H_B=\frac{\ln(N!)-\sum_{i=1}^R{\ln(n_i!)}}{N}$$

Magurran, A. E. (2004). Measuring biological diversity. Blackwell.
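As a rough check on the formulas, the indices can be computed by hand in base R from a vector of word counts. The sketch below is not qdap's code: diversity_sketch is a hypothetical helper, natural logarithms are assumed throughout, and Simpson's index is taken in its usual form, the sum of n_i(n_i - 1) over N(N - 1). Here n_i is the count of word type i, N the total word count, and p_i = n_i / N.

```r
# Hypothetical helper (not qdap's code): indices from a vector of word counts.
diversity_sketch <- function(counts) {
  n <- as.numeric(counts[counts > 0])  # n_i: frequency of each word type
  N <- sum(n)                          # N: total number of words
  p <- n / N                           # p_i: relative frequency of each type
  c(
    shannon       = -sum(p * log(p)),
    simpson       = sum(n * (n - 1)) / (N * (N - 1)),
    collision     = -log(sum(p^2)),
    berger_parker = max(n) / N,
    # lfactorial(x) is log(x!), which avoids overflow for large counts
    brillouin     = (lfactorial(N) - sum(lfactorial(n))) / N
  )
}

res <- diversity_sketch(c(the = 2, cat = 1, sat = 1))
round(res, 3)
```

Values returned by diversity itself may differ from this sketch, for example through rounding or through how the text is split into words.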

References

http://arxiv.org/abs/physics/0512106

Examples

# NOT RUN {
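# Diversity indices for each combination of sex, died and fam.aff
div.mod <- with(mraja1spl, diversity(dialogue, list(sex, died, fam.aff)))
# Split the combined grouping column into separate columns
colsplit2df(div.mod)
# Plot the indices; high and low set the colours for large and small values
plot(div.mod, high = "red", low = "yellow")
plot(div.mod, high = "red", low = "yellow", values = TRUE)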
# }
