
Conake (version 1.0.1)

Conake-package: Continuous Associated Kernel Estimation

Description

Continuous smoothing of a probability density function defined on a compact support \(T=[a,b]\) or a semi-infinite support \(T=[0,\infty)\) is performed using four continuous associated kernels: extended beta, gamma, lognormal and reciprocal inverse Gaussian. Cross-validation is also implemented to select the smoothing parameter.

Details

The estimated density:

The kernel estimator \(\widehat{f}_n\) of \(f\) is defined as

$$\widehat{f}_n(x) = \frac{1}{n}\sum_{i=1}^{n}{K_{x,h}(X_i)},$$ where \(K_{x,h}\) is one of the kernels defined below. In practice, we first calculate the normalizing constant

$${C}_n = \int_{x\in T}{\widehat{f}_n(x)dx},$$ where \(T\) is the support of the density function. This normalizing constant is generally not equal to 1, because the kernel is integrated over the target \(x\) rather than over its own argument. The estimated density is then \(\tilde{f}_n=\widehat{f}_n/C_n\).
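The two steps above (average the kernel over the sample, then divide by \(C_n\)) can be sketched in base R. This is a minimal illustration, not the package's dke code: the names gamma_kernel, f_hat and f_tilde, the simulated sample and the bandwidth value are all assumptions made for the example, and the gamma kernel is used only as one of the four choices.

```r
# Gamma kernel K_{x,h}(y): gamma density with shape 1 + x/h and scale h
# (hypothetical helper, not a Conake function)
gamma_kernel <- function(y, x, h) dgamma(y, shape = 1 + x / h, scale = h)

# Unnormalized estimator: for each target x, average K_{x,h}(X_i) over the sample
f_hat <- function(x, data, h) {
  sapply(x, function(x0) mean(gamma_kernel(data, x0, h)))
}

set.seed(1)
X <- rexp(100, rate = 1)  # illustrative sample on [0, Inf)
h <- 0.2                  # illustrative bandwidth, not an optimized value

# Normalizing constant C_n: integral of f_hat over the support T = [0, Inf)
C_n <- integrate(f_hat, lower = 0, upper = Inf, data = X, h = h)$value

# Normalized estimate f_tilde = f_hat / C_n
f_tilde <- function(x) f_hat(x, X, h) / C_n

f_tilde(c(0.5, 1, 2))
```

Printing C_n shows how far the raw estimator is from integrating to 1 for a given sample and bandwidth.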

Given a data sample, the Conake package computes the density estimate (dke) using one of the four kernel functions: extended beta, gamma, lognormal and reciprocal inverse Gaussian. The bandwidth parameter is selected by cross-validation (cvbw). The kernel functions (kef) are defined below.

Extended beta kernel:

The extended beta kernel is defined on \({S}_{x,h,a,b}=[a,b]=T\) with \(a<b<\infty\), \(x \in T\) and \(h>0\):

$$BE_{x,h,a,b}(y) = \frac {(y-a)^{(x-a)/\{(b-a)h\}}(b-y)^{(b-x)/\{(b-a)h\}}} {(b-a)^{1+h^{-1}}B\left(1+(x-a)/\{(b-a)h\},1+(b-x)/\{(b-a)h\}\right)}1_{S_{x,h,a,b}}(y),$$

where \(B(r,s)=\int_0^1 t^{r-1}(1-t)^{s-1}dt\) is the usual beta function with \(r>0\), \(s>0\), and \(1_A\) denotes the indicator function of \(A\). For \(a=0\) and \(b=1\), the extended beta kernel reduces to the beta kernel, which is the probability density function of the beta distribution with shape parameters \(1+x/h\) and \(1+(1-x)/h\); see Libengué (2013).
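As a check on the definition, the extended beta kernel can be written as a standard beta density on the rescaled variable \((y-a)/(b-a)\), divided by \(b-a\). The sketch below compares that form with a direct transcription of the displayed formula. The function names and the values of \(x\), \(h\), \(a\), \(b\) are assumptions chosen for illustration and are not taken from the package.

```r
# Extended beta kernel via the standard beta density on (y - a) / (b - a)
# (hypothetical helper; dbeta returns 0 outside [0, 1], which plays the role of the indicator)
ext_beta_kernel <- function(y, x, h, a = 0, b = 1) {
  shape1 <- 1 + (x - a) / ((b - a) * h)
  shape2 <- 1 + (b - x) / ((b - a) * h)
  dbeta((y - a) / (b - a), shape1, shape2) / (b - a)
}

# Direct transcription of the displayed formula, valid for y in [a, b]
ext_beta_formula <- function(y, x, h, a, b) {
  p <- (x - a) / ((b - a) * h)
  q <- (b - x) / ((b - a) * h)
  (y - a)^p * (b - y)^q / ((b - a)^(1 + 1 / h) * beta(1 + p, 1 + q))
}

y <- seq(2.1, 4.9, by = 0.1)
same <- all.equal(ext_beta_kernel(y, x = 3, h = 0.1, a = 2, b = 5),
                  ext_beta_formula(y, x = 3, h = 0.1, a = 2, b = 5))

# As a density in y, the kernel integrates to 1 over [a, b]
total <- integrate(ext_beta_kernel, 2, 5, x = 3, h = 0.1, a = 2, b = 5)$value
```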

Gamma kernel:

The gamma kernel is defined on \({S}_{x,h}=[0,+\infty)=T\) with \(x \in T\) and \(h>0\):

$$GA_{x,h}(y) = \frac {y^{x/h}} {\Gamma(1+x/h)h^{1+x/h}}\exp\left(-\frac{y}{h} \right)1_{S_{x,h}}(y),$$

where \(\Gamma(\cdot)\) is the classical gamma function. It is the probability density function of the gamma distribution with shape parameter \(1+x/h\) and scale parameter \(h\); see Chen (2000) and also Libengué (2013).
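The correspondence with the gamma distribution can be verified numerically in base R. The following sketch compares the displayed formula with dgamma and locates the mode of the kernel, which for a gamma density is (shape - 1) times scale and therefore equals the target \(x\). Function names and parameter values are illustrative assumptions.

```r
# Direct transcription of the gamma kernel formula (hypothetical helper)
gamma_kernel_formula <- function(y, x, h) {
  y^(x / h) * exp(-y / h) / (gamma(1 + x / h) * h^(1 + x / h))
}

# Same kernel through the built-in gamma density: shape 1 + x/h, scale h
gamma_kernel <- function(y, x, h) dgamma(y, shape = 1 + x / h, scale = h)

y <- seq(0.1, 6, by = 0.1)
same <- all.equal(gamma_kernel_formula(y, x = 2, h = 0.25),
                  gamma_kernel(y, x = 2, h = 0.25))

# The kernel integrates to 1 in y and peaks at the target x
total <- integrate(gamma_kernel, 0, Inf, x = 2, h = 0.25)$value
mode  <- optimize(gamma_kernel, interval = c(0, 10),
                  x = 2, h = 0.25, maximum = TRUE)$maximum
```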

Lognormal kernel:

The lognormal kernel is defined on \({S}_{x,h}=[0,\infty)=T\) with \(x \in T\) and \(h>0\):

$$LN_{x,h}(y) = \frac {1} {yh\sqrt{2\pi}}\exp\left\{-\frac{1}{2}\left(\frac{1}{h}\log\left(\frac{y}{x}\right)-h \right)^{2}\right\}1_{S_{x,h}}(y).$$

It is the probability density function of the classical lognormal distribution whose logarithm has mean \(\log(x)+h^{2}\) and standard deviation \(h\); see Igarashi and Kakizawa (2015) and also Libengué (2013).
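A short base R check of this parameterization: the displayed formula should agree with dlnorm when meanlog is \(\log(x)+h^2\) and sdlog is \(h\), and the mode of that lognormal density, exp(meanlog - sdlog^2), falls at the target \(x\). The function names and the values of \(x\) and \(h\) are assumptions for illustration only.

```r
# Direct transcription of the lognormal kernel formula (hypothetical helper)
ln_kernel_formula <- function(y, x, h) {
  exp(-0.5 * (log(y / x) / h - h)^2) / (y * h * sqrt(2 * pi))
}

# Same kernel through the built-in lognormal density
ln_kernel <- function(y, x, h) dlnorm(y, meanlog = log(x) + h^2, sdlog = h)

y <- seq(0.1, 6, by = 0.1)
same <- all.equal(ln_kernel_formula(y, x = 2, h = 0.5),
                  ln_kernel(y, x = 2, h = 0.5))

# The kernel integrates to 1 in y and peaks at the target x
total <- integrate(ln_kernel, 0, Inf, x = 2, h = 0.5)$value
mode  <- optimize(ln_kernel, interval = c(0.01, 10),
                  x = 2, h = 0.5, maximum = TRUE)$maximum
```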

Reciprocal inverse Gaussian kernel:

The reciprocal inverse Gaussian kernel is defined on \({S}_{x,h}=(0,\infty)=T\) with \(x \in T\) and \(h>0\):

$$RIG_{x,h}(y) = \frac {1}{\sqrt{2\pi hy}} \exp\left\{-\frac{\zeta(x,h)}{2h}\left(\frac{y}{\zeta(x,h)}-2+\frac{\zeta(x,h)}{y}\right)\right\}1_{S_{x,h}}(y),$$

where \(\zeta(x,h)=(x^2+xh)^{1/2}\). It is the probability density function of the classical reciprocal inverse Gaussian distribution, that is, the distribution of the reciprocal of an inverse Gaussian variable with mean parameter \(1/\sqrt{x^2+xh}\) and shape parameter \(1/h\); see Igarashi and Kakizawa (2015) and also Libengué (2013).
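Base R has no built-in reciprocal inverse Gaussian density, so the sketch below transcribes the displayed formula directly and checks two properties numerically: the kernel integrates to 1 in \(y\), and its mode is at the target \(x\) (setting the derivative of the log-density to zero gives \(y^2+hy-\zeta^2=0\), whose positive root is \(x\)). The function name and parameter values are assumptions for illustration.

```r
# Direct transcription of the reciprocal inverse Gaussian kernel
# (hypothetical helper, not a Conake function)
rig_kernel <- function(y, x, h) {
  zeta <- sqrt(x^2 + x * h)
  out <- numeric(length(y))  # zero outside the support (0, Inf)
  pos <- y > 0
  yp <- y[pos]
  out[pos] <- exp(-zeta / (2 * h) * (yp / zeta - 2 + zeta / yp)) /
    sqrt(2 * pi * h * yp)
  out
}

# The kernel integrates to 1 in y
total <- integrate(rig_kernel, 0, Inf, x = 2, h = 0.25)$value

# The kernel peaks at the target x
mode <- optimize(rig_kernel, interval = c(0.01, 10),
                 x = 2, h = 0.25, maximum = TRUE)$maximum
```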

The bandwidth selection:

The cross-validation technique cvbw is used for the bandwidth selection. The optimal parameter is the one which minimizes the cross-validation function defined by:

$$CV(h)=\int_{x\in T}{\{\widehat{f}_n(x)\}^{2}dx}-\frac{2}{n}\sum_{i=1}^{n}{\widehat{f}_{n,-i}(X_i)},$$

where \(\widehat{f}_{n,-i}(X_i)=(n-1)^{-1}\sum_{j \ne i}^{n}K_{X_i,h}(X_j)\) is the density estimator computed without the observation \(X_{i}\).
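The criterion can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's cvbw implementation: the gamma kernel is used as an example, the helper names (gamma_kernel, f_hat, cv_score), the simulated sample and the bandwidth grid are all hypothetical, and the minimizer is found by a simple grid search.

```r
# Gamma kernel K_{x,h}(y), used here as an example kernel
gamma_kernel <- function(y, x, h) dgamma(y, shape = 1 + x / h, scale = h)

# Unnormalized estimator: average of K_{x,h}(X_i) for each target x
f_hat <- function(x, data, h) {
  sapply(x, function(x0) mean(gamma_kernel(data, x0, h)))
}

# Cross-validation function CV(h)
cv_score <- function(h, data) {
  n <- length(data)
  # First term: integral over T = [0, Inf) of the squared estimator
  sq_int <- integrate(function(x) f_hat(x, data, h)^2, 0, Inf)$value
  # Second term: leave-one-out estimates with target X_i,
  # averaging K_{X_i,h}(X_j) over the other observations j != i
  loo <- sapply(seq_len(n), function(i) mean(gamma_kernel(data[-i], data[i], h)))
  sq_int - 2 / n * sum(loo)
}

set.seed(1)
X <- rgamma(60, shape = 2, scale = 1)  # illustrative sample
h_grid <- seq(0.1, 1, by = 0.05)       # illustrative bandwidth grid

cv_values <- sapply(h_grid, cv_score, data = X)
h_cv <- h_grid[which.min(cv_values)]   # grid point minimizing CV(h)
```

A grid search keeps the example transparent. If the minimizer lands on an endpoint of h_grid, the grid should be widened before the value is trusted.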

References

Chen, S. X. (1999). Beta kernel estimators for density functions, Computational Statistics and Data Analysis 31, 131-145.

Chen, S. X. (2000). Gamma kernels estimators for density functions, Annals of the Institute of Statistical Mathematics 52, 471-480.

Libengué, F.G. (2013). Méthode Non-Paramétrique par Noyaux Associés Mixtes et Applications, Ph.D. Thesis Manuscript (in French) to Université de Franche-Comté, Besançon, France and Université de Ouagadougou, Burkina Faso, June 2013, LMB no. 14334, Besançon.

Igarashi, G. and Kakizawa, Y. (2015). Bias correction for some asymmetric kernel estimators, Journal of Statistical Planning and Inference 159, 37-63.