
monomvn (version 1.9-1)

metrics: RMSE, Expected Log Likelihood and KL Divergence Between Two Multivariate Normal Distributions

Description

These functions calculate the root-mean-squared-error (RMSE), the expected log likelihood, and the Kullback-Leibler (KL) divergence (a.k.a. distance) between two multivariate normal (MVN) distributions, each described by its mean vector and covariance matrix.

Usage

rmse.muS(mu1, S1, mu2, S2)
Ellik.norm(mu1, S1, mu2, S2, quiet=FALSE)
kl.norm(mu1, S1, mu2, S2, quiet=FALSE, symm=FALSE)

Arguments

mu1
mean vector of first (estimated) MVN
S1
covariance matrix of first (estimated) MVN
mu2
mean vector of second (true, baseline, or comparator) MVN
S2
covariance matrix of second (true, baseline, or comparator) MVN
quiet
when FALSE (default), gives a warning if the accuracy package cannot be loaded to deal with (possibly) non-positive definite S1 and/or S2
symm
when TRUE a symmetrized version of the KL divergence is used; see the note below
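
For intuition, the sketch below shows one common symmetrization, the average of the two directed divergences. Treat this as an assumption about what symm=TRUE computes rather than a transcription of the package source; the helper name kl.norm.symm is hypothetical.

## hypothetical symmetrization: average of the two directed KL divergences
## (assumes monomvn is attached so kl.norm is available)
kl.norm.symm <- function(mu1, S1, mu2, S2)
  (kl.norm(mu1, S1, mu2, S2) + kl.norm(mu2, S2, mu1, S1)) / 2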

Value

  • Ellik.norm returns a real number (the expected log likelihood). rmse.muS returns a positive real number (the RMSE). kl.norm returns a positive real number giving the divergence between the two normal distributions.

Details

The root-mean-squared-error is calculated between the entries of the mean vectors and the upper-triangular parts of the covariance matrices (including the diagonal). The KL divergence is given by the formula: $$D_{\mbox{\tiny KL}}(N_1 \| N_2) = \frac{1}{2} \left( \log \left( \frac{|\Sigma_1|}{|\Sigma_2|} \right) + \mbox{tr} \left( \Sigma_1^{-1} \Sigma_2 \right) + \left( \mu_1 - \mu_2\right)^\top \Sigma_1^{-1} ( \mu_1 - \mu_2 ) - N \right)$$ where $N$ is length(mu1), which must agree with the dimensions of the other parameters.
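
As a concrete illustration, the sketch below recomputes both quantities directly from the description above. Pooling the mean entries with the upper-triangular covariance entries for the RMSE is an assumption based on the wording, and the KL helper simply transcribes the displayed formula; the names rmse.byhand and kl.byhand are hypothetical, and their values should only agree with rmse.muS and kl.norm up to numerical error for well-conditioned inputs.

## hypothetical by-hand versions of rmse.muS and kl.norm
rmse.byhand <- function(mu1, S1, mu2, S2) {
  ## pool mean differences with upper-triangular (incl. diagonal) covariance differences
  d <- c(mu1 - mu2, S1[upper.tri(S1, diag=TRUE)] - S2[upper.tri(S2, diag=TRUE)])
  sqrt(mean(d^2))
}
kl.byhand <- function(mu1, S1, mu2, S2) {
  N <- length(mu1)
  S1i <- solve(S1)                          ## Sigma_1^{-1}
  m <- mu1 - mu2
  drop(0.5 * (log(det(S1)/det(S2)) +        ## log-determinant ratio
              sum(diag(S1i %*% S2)) +       ## trace term
              t(m) %*% S1i %*% m - N))      ## quadratic form minus dimension
}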

The expected log likelihood can be formulated in terms of the KL divergence. That is, the expected log likelihood of data simulated from the normal distribution with parameters mu2 and S2 under the estimated normal with parameters mu1 and S1 is given by

$$-\frac{1}{2} \ln \left\{ (2\pi e)^N |\Sigma_2| \right\} - D_{\mbox{\tiny KL}}(N_1 \| N_2).$$

The sechol function from the accuracy package is used to decompose (possibly) non-positive definite S1 and/or S2, giving more stable and robust calculations in the face of the numerical instabilities that can occur for larger problems.
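
To make this identity concrete, here is a minimal sketch (with a hypothetical helper name) that recovers the expected log likelihood from kl.norm; it should agree with Ellik.norm up to numerical error.

## hypothetical helper: expected log likelihood via the KL identity above
## (assumes monomvn is attached so kl.norm is available)
ellik.from.kl <- function(mu1, S1, mu2, S2) {
  N <- length(mu1)
  -0.5 * log((2 * pi * exp(1))^N * det(S2)) - kl.norm(mu1, S1, mu2, S2)
}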

References

http://faculty.chicagobooth.edu/robert.gramacy/monomvn.html

See Also

posdef.approx

Examples

library(monomvn)

## random means and positive definite covariances via crossproducts
mu1 <- rnorm(5)
s1 <- matrix(rnorm(100), ncol=5)
S1 <- t(s1) %*% s1

mu2 <- rnorm(5)
s2 <- matrix(rnorm(100), ncol=5)
S2 <- t(s2) %*% s2

## RMSE
rmse.muS(mu1, S1, mu2, S2)

## expected log likelihood
Ellik.norm(mu1, S1, mu2, S2)

## KL is not symmetric
kl.norm(mu1, S1, mu2, S2)
kl.norm(mu2, S2, mu1, S1)

## symmetric version
kl.norm(mu2, S2, mu1, S1, symm=TRUE)
