entropy (version 1.1.4)

entropy.Dirichlet: Family of Dirichlet Entropy and Mutual Information Estimators

Description

entropy.Dirichlet estimates the Shannon entropy H of the random variable Y from the corresponding observed counts y by plug-in of Bayesian estimates of the bin frequencies using the Dirichlet-multinomial pseudocount model.

mi.Dirichlet estimates the corresponding mutual information of two random variables.

freqs.Dirichlet computes the Bayesian estimates of the bin frequencies using the Dirichlet-multinomial pseudocount model.

Usage

entropy.Dirichlet(y, a, unit=c("log", "log2", "log10"))
mi.Dirichlet(y, a, unit=c("log", "log2", "log10"))
freqs.Dirichlet(y, a)

Arguments

y
vector or matrix of counts.
a
pseudocount per bin.
unit
the unit in which entropy is measured.

Value

entropy.Dirichlet returns an estimate of the Shannon entropy.

mi.Dirichlet returns an estimate of the mutual information.

freqs.Dirichlet returns the Bayesian estimates of the bin frequencies.

Details

The Dirichlet-multinomial pseudocount entropy estimator is a Bayesian plug-in estimator: in the definition of the Shannon entropy the bin probabilities are replaced by the respective Bayesian estimates of the frequencies, using a model with a Dirichlet prior and a multinomial likelihood.
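
To make the plug-in step concrete, the following minimal sketch (an illustration, not the package's internal code) computes the Shannon entropy from a vector of estimated bin frequencies; within the package this step corresponds to entropy.plugin:

# sketch: plug-in Shannon entropy from estimated bin frequencies (natural log)
H.plugin.sketch = function(f) {
  f = f[f > 0]   # bins with zero frequency contribute 0*log(0) = 0
  -sum(f * log(f))
}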

The parameter a is the parameter of the Dirichlet prior and in effect specifies the pseudocount per bin. Popular choices of a are listed here; a short sketch of how a enters the frequency estimates follows the list:

  • a=0: maximum likelihood estimator (see entropy.empirical)
  • a=1/2: Jeffreys' prior; Krichevsky-Trofimov (1981) entropy estimator
  • a=1: Laplace's prior
  • a=1/length(y): Schurmann-Grassberger (1996) entropy estimator
  • a=sqrt(sum(y))/length(y): minimax prior
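
Under a symmetric Dirichlet prior with pseudocount a per bin, the Bayesian (posterior-mean) frequency estimates take the standard form (y_k + a)/(sum(y) + a*K), where K is the number of bins. A minimal sketch, assuming this standard form (freqs.Dirichlet is the package's implementation):

# sketch: posterior-mean bin frequencies with pseudocount a per bin
freqs.sketch = function(y, a) (y + a) / (sum(y) + a * length(y))

Putting the two sketches together, entropy.Dirichlet(y, a) corresponds to the plug-in entropy of these estimated frequencies, i.e. H.plugin.sketch(freqs.sketch(y, a)).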

References

Agresti, A., and D. B. Hitchcock. 2005. Bayesian inference for categorical data analysis. Stat. Methods Appl. 14:297-330.

Krichevsky, R. E., and V. K. Trofimov. 1981. The performance of universal encoding. IEEE Trans. Inf. Theory 27:199-207.

Schurmann, T., and P. Grassberger. 1996. Entropy estimation of symbol sequences. Chaos 6:414-427.

See Also

entropy, entropy.shrink, entropy.NSB, entropy.ChaoShen, entropy.empirical, entropy.plugin, mi.plugin.

Examples

# load entropy library 
library("entropy")

# observed counts for each bin
y = c(4, 2, 3, 0, 2, 4, 0, 0, 2, 1, 1)  

# Dirichlet estimate with a=0
entropy.Dirichlet(y, a=0)

# compare to empirical estimate
entropy.empirical(y)

# Dirichlet estimate with a=1/2
entropy.Dirichlet(y, a=1/2)

# Dirichlet estimate with a=1
entropy.Dirichlet(y, a=1)

# Dirichlet estimate with a=1/length(y)
entropy.Dirichlet(y, a=1/length(y))

# Dirichlet estimate with a=sqrt(sum(y))/length(y)
entropy.Dirichlet(y, a=sqrt(sum(y))/length(y))
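
# underlying Bayesian frequency estimates (a=1/2); they sum to one
freqs.Dirichlet(y, a=1/2)
sum(freqs.Dirichlet(y, a=1/2))

# entropy in bits rather than nats via the unit argument
entropy.Dirichlet(y, a=1, unit="log2")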


# contingency table with counts for two discrete variables
y = rbind( c(1,2,3), c(6,5,4) )

# Dirichlet estimate of mutual information (with a=1/2)
mi.Dirichlet(y, a=1/2)
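
# with a=0 the Dirichlet estimate reduces to the empirical
# (maximum likelihood) estimate of the mutual information
mi.Dirichlet(y, a=0)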
