lbckden: Cross-validation Log-likelihood of Boundary Corrected Kernel Density Estimation

Description

Cross-validation log-likelihood and negative log-likelihood for boundary corrected kernel density estimation, by treating it as a mixture model.

Usage

lbckden(x, lambda = NULL, extracentres = NULL,
    bcmethod = "simple", proper = TRUE, nn = "jf96",
    offset = 0, xmax = Inf, log = TRUE)

nlbckden(lambda, x, extracentres = NULL,
    bcmethod = "simple", proper = TRUE, nn = "jf96",
    offset = 0, xmax = Inf, finitelik = FALSE)

Arguments

x

quantile

lambda

scalar value of fixed bandwidth, or NULL (default)

bcmethod

boundary correction approach

proper

logical, should density be renormalised to integrate to unity (simple boundary correction only)

nn

non-negativity correction method (simple boundary correction only)

offset

offset added to kernel centres, for logtrans

xmax

upper bound on support, for copula and beta kernels only

log

logical, if TRUE then log density

extracentres

extra kernel centres used in KDE, but likelihood contribution not evaluated, or NULL

finitelik

logical, should log-likelihood return finite value for invalid parameters

Value

lbckden gives the cross-validation (log-)likelihood and nlbckden gives the negative cross-validation log-likelihood.
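The relationship between a log-likelihood and its negative counterpart can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package code: loglik_sketch and nloglik_sketch are hypothetical names, the kernel is a plain Gaussian with no boundary correction, and the penalty value 1e6 used when finitelik = TRUE is an arbitrary choice made for this example.

```r
# Hypothetical stand-in for a cross-validation log-likelihood:
# plain Gaussian kernel, no boundary correction, -Inf for an invalid bandwidth.
loglik_sketch <- function(x, lambda) {
  valid <- is.numeric(lambda) && length(lambda) == 1 &&
    is.finite(lambda) && lambda > 0
  if (!valid) return(-Inf)
  k <- dnorm(outer(x, x, "-"), sd = lambda)  # kernel of every point at every other
  diag(k) <- 0                               # leave each point out of its own estimate
  sum(log(rowSums(k) / (length(x) - 1)))
}

# Negative log-likelihood with the parameter as first argument,
# which is the form optimisers expect.
nloglik_sketch <- function(lambda, x, finitelik = FALSE) {
  nll <- -loglik_sketch(x, lambda)
  # Assumption: replace a non-finite value by a large finite penalty
  if (finitelik && !is.finite(nll)) nll <- 1e6
  nll
}

set.seed(42)
x <- rexp(100)
fit <- optimize(nloglik_sketch, interval = c(0.01, 2), x = x, finitelik = TRUE)
fit$minimum  # bandwidth minimising the negative log-likelihood on this sample
```

Returning a finite penalty rather than Inf keeps a numerical optimiser from failing when it steps into an invalid region of the parameter space, which is the purpose the finitelik argument describes.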

Warning

See warning in fbckden

Details

The cross-validation likelihood functions for the boundary corrected kernel density estimator, as used in the maximum likelihood fitting function fbckden. They are designed to be used for MLE in fbckden but are available for wider usage, e.g. constructing your own extreme value mixture models. All of the boundary correction methods available in bckden are permitted. See fkden and fgpd for full details.

The cross-validation likelihood is obtained by leaving each point out in turn, obtaining the usual KDE from the remaining points and evaluating it at the point left out: $$L(\lambda) = \prod_{i=1}^{n} \hat{f}_{-i}(x_i)$$ where $$\hat{f}_{-i}(x_i) = \frac{1}{(n-1)\lambda} \sum_{j=1, j\ne i}^{n} K\left(\frac{x_i - x_j}{\lambda}\right)$$ is the KDE obtained when the $i$th datapoint is dropped, evaluated at $x_i$.

Normally for likelihood estimation of the bandwidth the kernel centres and the data where the likelihood is evaluated are the same. However, when using KDE for extreme value mixture modelling, only those data in the bulk of the distribution should contribute to the likelihood, but all the data (including those beyond the threshold) should contribute to the density estimate. The extracentres option allows the user to specify extra kernel centres used in estimating the density, but not evaluated in the likelihood. The default is to just use the existing data, so extracentres=NULL.

Log-likelihood calculations are carried out directly in lbckden, which takes the bandwidth in the same form as the distribution functions; setting log=FALSE exponentiates the result to give the actual likelihood. The negative log-likelihood nlbckden is a wrapper for lbckden, designed to make it usable for optimisation (e.g. the parameters are given as a vector as the first input).
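The leave-one-out calculation and the role of extra kernel centres can be sketched in base R. This is a minimal illustration, not the package implementation: cv_loglik is a hypothetical helper, it uses a plain Gaussian kernel with no boundary correction, and it assumes that with extra centres the leave-one-out density is averaged over all remaining centres (data plus extra centres, minus the point left out).

```r
# Leave-one-out (cross-validation) log-likelihood for a plain Gaussian KDE.
# Hypothetical helper; no boundary correction is applied.
cv_loglik <- function(x, lambda, extracentres = NULL) {
  stopifnot(is.numeric(x), length(x) >= 2, lambda > 0)
  centres <- c(x, extracentres)        # every kernel centre used in the density
  n_centres <- length(centres)
  # Sum of kernels from all centres, evaluated at each data point in x;
  # dnorm(d, sd = lambda) equals K(d / lambda) / lambda for a Gaussian K
  kernel_sums <- rowSums(dnorm(outer(x, centres, "-"), sd = lambda))
  # Drop each point's own kernel (its value at distance zero) and
  # average over the remaining centres
  loo_density <- (kernel_sums - dnorm(0, sd = lambda)) / (n_centres - 1)
  sum(log(loo_density))
}

set.seed(1)
x <- rnorm(50)
# Usual case: kernel centres and likelihood data coincide
cv_loglik(x, lambda = 0.5)
# Mixture-model case: only the bulk (here x < 1) enters the likelihood,
# while points above the threshold still act as kernel centres
cv_loglik(x[x < 1], lambda = 0.5, extracentres = x[x >= 1])
```

With extracentres = NULL the function reduces to the product formula above on the log scale, with divisor n - 1.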

References

http://en.wikipedia.org/wiki/Kernel_density_estimation

http://en.wikipedia.org/wiki/Cross-validation_(statistics)

Scarrott, C.J. and MacDonald, A. (2012). A review of extreme value threshold estimation and uncertainty quantification. REVSTAT - Statistical Journal 10(1), 33-59. Available from http://www.ine.pt/revstat/pdf/rs120102.pdf

Bowman, A.W. (1984). An alternative method of cross-validation for the smoothing of density estimates. Biometrika 71(2), 353-360.

Duin, R.P.W. (1976). On the choice of smoothing parameters for Parzen estimators of probability density functions. IEEE Transactions on Computers C25(11), 1175-1179.

MacDonald, A., Scarrott, C.J., Lee, D., Darlow, B., Reale, M. and Russell, G. (2011). A flexible extreme value mixture model. Computational Statistics and Data Analysis 55(6), 2137-2157.