evmix (version 1.0)

lbckdengpd: Cross-validation Log-likelihood of Boundary Corrected Kernel Density Estimators for the Bulk and GPD Tail Extreme Value Mixture Model

Description

Cross-validation log-likelihood and negative log-likelihood for the Boundary Corrected Kernel Density Estimators for the Bulk and GPD Tail Extreme Value Mixture Model.

Usage

lbckdengpd(x, lambda = NULL, u = 0, sigmau = 1, xi = 0,
    phiu = TRUE, bcmethod = "simple", proper = TRUE,
    nn = "jf96", offset = 0, xmax = Inf, log = TRUE)

Arguments

x
quantile
phiu
logical
bcmethod
boundary correction approach
proper
logical, should density be renormalised to integrate to unity, simple boundary correction only
nn
non-negativity correction, for simple boundary correction only
offset
offset added to kernel centres, for logtrans
xmax
upper bound on support, for copula and beta kernels only
lambda
scalar value of fixed bandwidth, or NULL (default)
u
threshold
sigmau
scale parameter (non-negative)
xi
shape parameter
log
logical, if TRUE then log-likelihood rather than likelihood is output

Value

  • lbckdengpd gives cross-validation (log-)likelihood and nlbckdengpd gives the negative cross-validation log-likelihood.
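
A short sketch of the relationship between the two; the assumption that nlbckdengpd takes the parameters as a vector c(lambda, u, sigmau, xi) in its first argument follows the description in Details below (see fbckdengpd for the definitive interface):

    library(evmix)

    set.seed(1)
    x <- rgamma(500, shape = 2, scale = 1)
    lambda <- 0.3; u <- 4; sigmau <- 1; xi <- 0.1

    ll  <- lbckdengpd(x, lambda = lambda, u = u, sigmau = sigmau, xi = xi)
    nll <- nlbckdengpd(c(lambda, u, sigmau, xi), x)

    # the negative log-likelihood should simply be -1 times the log-likelihood
    all.equal(nll, -ll)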

Warning

See warning in fkden

Details

The cross-validation likelihood functions for the boundary corrected kernel density estimator for the bulk below the threshold and GPD for the upper tail, as used in the maximum likelihood fitting function fbckdengpd.

They are designed to be used for MLE in fbckdengpd but are available for wider usage, e.g. constructing your own extreme value mixture models. See fbckden, fkden and fgpd for full details.

Cross-validation likelihood is used for the boundary corrected kernel density component, but standard likelihood is used for the GPD component. The cross-validation likelihood for the KDE is obtained by leaving each point out in turn and evaluating the KDE at the point left out: $$L(\lambda) = \prod_{i=1}^{n_b} \hat{f}_{-i}(x_i)$$ where $$\hat{f}_{-i}(x_i) = \frac{1}{(n-1)\lambda} \sum_{j=1: j\ne i}^{n} K\left(\frac{x_i - x_j}{\lambda}\right)$$ is the boundary corrected KDE obtained when the $i$th datapoint is dropped out, evaluated at that dropped datapoint $x_i$.

Notice that the boundary corrected KDE sum is indexed over all datapoints ($j = 1, \ldots, n$, except datapoint $i$), whether they are below the threshold or in the upper tail. But the likelihood product is evaluated only for those data below the threshold ($i = 1, \ldots, n_b$). So the $j = n_b+1, \ldots, n$ datapoints are extra kernel centres from the data in the upper tail which are used in the boundary corrected KDE, but the likelihood is not evaluated there.

Log-likelihood calculations are carried out in lbckdengpd, which takes the bandwidth in the same form as the distribution functions. The negative log-likelihood, nlbckdengpd, is a wrapper for lbckdengpd designed to make it usable for optimisation (e.g. the parameters are given as a vector in the first input). The function lbckdengpd carries out the log-likelihood calculations directly, which can be exponentiated to give the actual likelihood using log = FALSE.
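
To make the leave-one-out construction concrete, here is a simplified illustration of the cross-validation log-likelihood using a plain (unbounded) Gaussian KDE in place of the boundary corrected estimator; it mirrors the formula above but is not the evmix implementation:

    # leave-one-out cross-validation log-likelihood with a Gaussian kernel
    set.seed(1)
    x <- rgamma(200, shape = 2, scale = 1)
    lambda <- 0.3
    u <- 4
    below <- which(x <= u)   # only points below the threshold enter the product

    cvloglik <- sum(sapply(below, function(i) {
      # KDE with the i-th point dropped, evaluated at x_i; the kernel centres
      # x[-i] include the points above the threshold as well
      log(mean(dnorm(x[i], mean = x[-i], sd = lambda)))
    }))
    cvloglik

In lbckdengpd itself this computation uses the boundary corrected kernels and is combined with the standard GPD log-likelihood for the points above the threshold.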

References

http://en.wikipedia.org/wiki/Kernel_density_estimation

http://en.wikipedia.org/wiki/Cross-validation_(statistics)

Scarrott, C.J. and MacDonald, A. (2012). A review of extreme value threshold estimation and uncertainty quantification. REVSTAT - Statistical Journal 10(1), 33-59. Available from http://www.ine.pt/revstat/pdf/rs120102.pdf

Bowman, A.W. (1984). An alternative method of cross-validation for the smoothing of density estimates. Biometrika 71(2), 353-360.

Duin, R.P.W. (1976). On the choice of smoothing parameters for Parzen estimators of probability density functions. IEEE Transactions on Computers C25(11), 1175-1179.

MacDonald, A., Scarrott, C.J., Lee, D., Darlow, B., Reale, M. and Russell, G. (2011). A flexible extreme value mixture model. Computational Statistics and Data Analysis 55(6), 2137-2157.

See Also

bckden, kden, gpd and density