ddhellingerpar: Distance between discrete probability distributions given the probabilities on their common support

Description

Hellinger (or Matusita) distance between two discrete probability distributions on the same support (which can be a Cartesian product of \(q\) sets), given the probabilities of the states (which are \(q\)-tuples) of the support.

Usage

ddhellingerpar(p1, p2)

Arguments

p1: array (or table) with \(q\) dimensions. The probabilities of the first distribution on the support.
p2: array (or table) with \(q\) dimensions. The probabilities of the second distribution on the same support.
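
As an illustration of a valid argument, the following minimal sketch builds a probability array for \(q = 2\) in base R. The variable names, level labels and probability values are made up for the example; the only assumption is that each argument is a probability distribution, so its entries are non-negative and sum to 1.

```r
# Illustrative joint distribution of two variables (q = 2):
# x takes values in {A, B}, y takes values in {a, b}.
# The probabilities are invented for this example.
p <- array(c(0.4, 0.1, 0.2, 0.3),
           dim = c(2, 2),
           dimnames = list(x = c("A", "B"), y = c("a", "b")))

length(dim(p))                # number of dimensions q
all(p >= 0)                   # probabilities are non-negative
isTRUE(all.equal(sum(p), 1))  # probabilities sum to 1
```

Both `p1` and `p2` should be laid out this way over the same states, so that corresponding cells refer to the same \(q\)-tuple.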

Author

Rachid Boumaza, Pierre Santagostini, Smail Yousfi, Sabine Demotes-Mainard

Details

The Hellinger distance between two discrete distributions \(p_1\) and \(p_2\) is given by: \(\sqrt{ \sum_x{(\sqrt{p_1(x)} - \sqrt{p_2(x)})^2}} \)

Note that some authors divide this expression by \(\sqrt{2}\), so that the distance lies between 0 and 1.
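
The formula above can be evaluated directly in base R. The following is a minimal sketch for illustration only, not the package's implementation; `hellinger_manual` is a hypothetical helper name, and the code assumes both arrays have identical dimensions and list the states in the same order.

```r
# Hypothetical helper (not part of the package): evaluates the formula above
# for two probability arrays defined on the same support.
hellinger_manual <- function(p1, p2) {
  # Assumption: both arrays describe the same states, in the same order
  stopifnot(identical(dim(p1), dim(p2)))
  sqrt(sum((sqrt(p1) - sqrt(p2))^2))
}

p1 <- array(c(1/2, 1/2), dimnames = list(c("a", "b")))
p2 <- array(c(1/4, 3/4), dimnames = list(c("a", "b")))

d <- hellinger_manual(p1, p2)
d            # distance as defined above (at most sqrt(2))
d / sqrt(2)  # variant divided by sqrt(2) (at most 1)
```

Because each distribution sums to 1, the squared distance equals \(2 - 2\sum_x \sqrt{p_1(x)\,p_2(x)}\), which is why the undivided distance never exceeds \(\sqrt{2}\).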

References

Deza, M.M. and Deza, E. (2013). Encyclopedia of Distances. Springer.

Examples

# Example 1: one-dimensional support {a, b}, probabilities given directly
p1 <- array(c(1/2, 1/2), dimnames = list(c("a", "b")))
p2 <- array(c(1/4, 3/4), dimnames = list(c("a", "b")))
ddhellingerpar(p1, p2)

# Example 2: two-dimensional support (q = 2), probabilities estimated from data
x1 <- data.frame(x = factor(c("A", "A", "A", "B", "B", "B")),
                 y = factor(c("a", "a", "a", "b", "b", "b")))
x2 <- data.frame(x = factor(c("A", "A", "A", "B", "B")),
                 y = factor(c("a", "a", "b", "a", "b")))
# Relative frequencies of the states (x, y) on the common support
p1 <- table(x1)/nrow(x1)
p2 <- table(x2)/nrow(x2)
ddhellingerpar(p1, p2)
