
copBasic (version 1.7.1)

kullCOP: Kullback-Leibler Divergence, Jeffrey's Divergence, and Kullback-Leibler Sample Size

Description

Compute the Kullback-Leibler divergence, Jeffrey's divergence, and Kullback-Leibler sample size following Joe (2015, pp. 234--237). Consider two densities $f = c_1(u,v; \Theta_f)$ and $g = c_2(u,v; \Theta_g)$ for two different bivariate copulas $\mathbf{C}_1(\Theta_f)$ and $\mathbf{C}_2(\Theta_g)$ having respective parameters $\Theta_f$ and $\Theta_g$. The Kullback-Leibler divergence of $f$ relative to $g$ is $$\mathrm{KL}(f|g) = \int\!\!\int_{\mathcal{I}^2} g\, \log(g/f)\,\mathrm{d}u\mathrm{d}v\mbox{,}$$ and the Kullback-Leibler divergence of $g$ relative to $f$ is $$\mathrm{KL}(g|f) = \int\!\!\int_{\mathcal{I}^2} f\, \log(f/g)\,\mathrm{d}u\mathrm{d}v\mbox{,}$$ where the limits of integration $\mathcal{I}^2$ theoretically are closed on $[0,1]^2$, but an open set $(0,1)^2$ might be needed for numerical stability of the nested calls to the integrate function because of the rectangular density estimation used by densityCOP. Note that in general $\mathrm{KL}(f|g) \ne \mathrm{KL}(g|f)$. The $\mathrm{KL}(f|g)$ is the expected log-likelihood ratio of $g$ to $f$ when $g$ is the true density (Joe, 2015, p. 234), whereas $\mathrm{KL}(g|f)$ is the opposite. This asymmetry motivates Jeffrey's divergence, a symmetrized version of the two Kullback-Leibler divergences defined as $$J(f,g) = \mathrm{KL}(f|g) + \mathrm{KL}(g|f) = \int\!\!\int_{\mathcal{I}^2} (g-f)\, \log(g/f)\,\mathrm{d}u\mathrm{d}v\mbox{.}$$
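
The double integral for $\mathrm{KL}(f|g)$ can be approximated directly with nested calls to the integrate function over copula densities from densityCOP. The short sketch below is not the internal code of kullCOP; it only illustrates the definition, with a Gumbel-Hougaard copula as $f$, a Plackett copula as $g$, and an illustrative offset del:

library(copBasic)
del   <- 1e-4 # small offset to stay inside the open unit square
fdens <- function(u,v) densityCOP(u,v, cop=GHcop,       para=2       ) # density f
gdens <- function(u,v) densityCOP(u,v, cop=PLACKETTcop, para=11.40484) # density g
KLfg <- integrate(function(us) sapply(us, function(u) {
          integrate(function(vs) sapply(vs, function(v) {
            gdens(u,v) * log(gdens(u,v) / fdens(u,v)) # integrand of KL(f|g)
          }), lower=del, upper=1-del)$value
        }), lower=del, upper=1-del)$value
print(KLfg) # numerical estimate of KL(f|g)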

The variances of the Kullback-Leibler divergences are defined as $$\sigma^2_{\mathrm{KL}(f|g)} = \int\!\!\int_{\mathcal{I}^2} g\,[\log(g/f)]^2\,\mathrm{d}u\mathrm{d}v - [\mathrm{KL}(f|g)]^2\mbox{,}$$ and $$\sigma^2_{\mathrm{KL}(g|f)} = \int\!\!\int_{\mathcal{I}^2} f\,[\log(f/g)]^2\,\mathrm{d}u\mathrm{d}v - [\mathrm{KL}(g|f)]^2\mbox{.}$$
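
Continuing the illustrative sketch above, $\sigma^2_{\mathrm{KL}(f|g)}$ follows from the same nested integration with the squared log-ratio integrand, minus the squared divergence:

E2fg    <- integrate(function(us) sapply(us, function(u) {
             integrate(function(vs) sapply(vs, function(v) {
               gdens(u,v) * log(gdens(u,v) / fdens(u,v))^2 # second-moment integrand
             }), lower=del, upper=1-del)$value
           }), lower=del, upper=1-del)$value
sigKLfg <- sqrt(E2fg - KLfg^2) # standard deviation of KL(f|g)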

For comparison of copula families $f$ and $g$ and taking $\alpha = 0.05$, the Kullback-Leibler sample size is defined as $$n_{fg} = [\Phi^{(-1)}(1-\alpha) \times \eta_\mathrm{KL}]^2\mbox{,}$$ where $\Phi^{(-1)}(t)$ is the quantile function of the standard normal distribution for nonexceedance probability $t$, and $\eta_\mathrm{KL}$ is the maximum $$\eta_\mathrm{KL} = \mathrm{max}[\sigma_{\mathrm{KL}(f|g)}/\mathrm{KL}(f|g),\, \sigma_{\mathrm{KL}(g|f)}/\mathrm{KL}(g|f)]\mbox{.}$$ The $n_{fg}$ indicates the sample size needed to distinguish $f$ from $g$ with a probability of at least $1 - \alpha = 1 - 0.05 = 0.95$ or 95 percent.
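
In R, $\Phi^{(-1)}(t)$ is qnorm(t), so the sample-size formula can be evaluated directly. The sketch below assumes KLfg and sigKLfg from the preceding sketches, and that KLgf and sigKLgf have been computed the same way with the roles of the two densities swapped:

alpha <- 0.05
etaKL <- max(sigKLfg/KLfg, sigKLgf/KLgf) # the larger ratio governs the sample size
nfg   <- (qnorm(1-alpha) * etaKL)^2
ceiling(nfg) # sample size needed to distinguish f from g at the 95-percent level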

Usage

kullCOP(cop1=NULL, cop2=NULL, para1=NULL, para2=NULL, alpha=0.05,
        del=.Machine$double.eps^0.25, ...)

Arguments

cop1
A copula function corresponding to copula $f$ in Joe (2015);
para1
Vector of parameters or other data structure, if needed, to pass to the copula $f$;
cop2
A copula function corresponding to copula $g$ in Joe (2015);
para2
Vector of parameters or other data structure, if needed, to pass to the copula $g$;
alpha
The $\alpha$ in the Kullback-Leibler sample size equation;
del
A small value used to denote the lo and hi values of the numerical integration: lo = del and hi = 1 - del. If del == 0, then lo = 0 and hi = 1, which corresponds to the theoretical limits $\mathcal{I}^2 = [0,1]^2$;
...
Additional arguments to pass to the densityCOP function.

Value

An R list is returned having the following components:

divergences
A vector of the Kullback-Leibler divergences and their standard deviations: $\mathrm{KL}(f|g)$, $\sigma_{\mathrm{KL}(f|g)}$, $\mathrm{KL}(g|f)$, $\sigma_{\mathrm{KL}(g|f)}$;
Jeffrey.divergence
Jeffrey's divergence $J(f,g)$;
KL.sample.size
Kullback-Leibler sample size $n_{fg}$; and
integrations
An R list of the outer integrate function call for the respective numerical integrals shown in this documentation.
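
For example, the components can be extracted by name from the returned R list; the parameters below are the same illustrative choices used in the Examples:

KL <- kullCOP(cop1=GHcop, para1=2, cop2=PLACKETTcop, para2=11.40484)
KL$divergences        # KL(f|g), sigma_KL(f|g), KL(g|f), sigma_KL(g|f)
KL$Jeffrey.divergence # J(f,g)
KL$KL.sample.size     # n_fg
KL$integrations       # outer integrate() results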

References

Joe, H., 2015, Dependence modeling with copulas: Boca Raton, CRC Press, 462 p.

See Also

densityCOP, vuongCOP

Examples

# Joe (2015, p. 237, table 5.2)
# The Gumbel-Hougaard copula and Plackett copula below each have a Kendall's tau
# of about 0.5. Joe (2015) lists in the table that Jeffrey's divergence is about
# 0.110 and the Kullback-Leibler sample size is 133. Joe (2015) does not list
# the parameters for either copula, just that for tau = 0.5.
KL <- kullCOP(cop1=GHcop, para1=2, cop2=PLACKETTcop, para2=11.40484)
# Reports Jeffrey's divergence      =   0.1087776
#      Kullback-Leibler sample size = 132
# using the default open set for the nested integrations. The closed set [0,1]
# causes an integral divergence error for the Plackett copula.

# Joe (2015, p. 237, table 5.3)
# The Gumbel-Hougaard copula and Plackett copula below each have a Spearman's rho
# of about 0.5. Joe (2015) lists in the table that Jeffrey's divergence is about
# 0.063 and the Kullback-Leibler sample size is 210. Joe (2015) does not list
# the parameters for either copula, just that for rho = 0.5.
KL <- kullCOP(cop1=GHcop, para1=1.541071, cop2=PLACKETTcop, para2=5.115658,
                                                      del=0.0002, deluv=0.00001)
# Reports Jeffrey's divergence      =   0.06057848
#      Kullback-Leibler sample size = 207
# using the default open set for the nested integrations. The closed set [0,1]
# causes an integral divergence error for the Plackett copula. Adjustments are
# made to del for the numerical integrations and to deluv for the rectangular
# probability densities. Joe (2015) likely did the numerical integrations using
# analytical expressions for the probability densities and not the rectangular
# approximations used in this package by the densityCOP() function.
