dkde: Derivatives of Kernel Density Estimator

Description

The (S3) generic function dkde computes the r'th derivative of the kernel density estimator for one-dimensional data. Its default method does so with the given kernel and bandwidth $h$ for one-dimensional observations.

Usage

dkde(x, ...)
## S3 method for class 'default':
dkde(x, y = NULL, deriv.order = 0, h, kernel = c("gaussian", 
         "epanechnikov", "uniform", "triangular", "triweight", 
         "tricube", "biweight", "cosine"), ...)

Arguments

x

the data from which the estimate is to be computed.

y

the points of the grid at which the density derivative is to be estimated; by default the grid extends $\tau h$ beyond range(x) on each side, where $\tau = 4$.

deriv.order

derivative order (scalar).

h

the smoothing bandwidth to be used; can also be a character string giving a rule to choose the bandwidth (see h.bcv). The default is h.ucv.

kernel

a character string giving the smoothing kernel to be used, with default "gaussian".

...

further arguments for (non-default) methods.
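The default evaluation grid described for y can be sketched in base R as follows. This is only an illustration of the rule "$\tau h$ beyond range(x), with $\tau = 4$"; the number of grid points (n_points) and the toy data are assumptions made here, not values taken from the package.

```r
# Toy data and a hand-picked bandwidth (both hypothetical)
x <- c(-1.2, 0.3, 0.8, 2.5)
h <- 0.5

tau <- 4          # extension factor stated in the argument description
n_points <- 512   # assumed grid size, not taken from the package

# Grid covering range(x), extended by tau * h on each side
y <- seq(min(x) - tau * h, max(x) + tau * h, length.out = n_points)
```

Extending the grid by a few bandwidths lets the estimate decay to (nearly) zero at both ends instead of being cut off at the extreme observations.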

Value

x

data points, same as input.

data.name

the deparsed name of the x argument.

n

the sample size after elimination of missing values.

kernel

name of the kernel used.

deriv.order

the derivative order used.

h

the bandwidth value used.

eval.points

the coordinates of the points where the density derivative is estimated.

est.fx

the estimated density derivative values.

Details

A simple estimator for the density derivative can be obtained by taking the derivative of the kernel density estimate. If the kernel $K(x)$ is differentiable $r$ times, then the r'th density derivative estimate can be written as: $$\hat{f}^{(r)}_{h}(x)=\frac{1}{nh^{r+1}}\sum_{i=1}^{n} K^{(r)}\left(\frac{x-X_{i}}{h}\right)$$ where $$K^{(r)}(x) = \frac{d^{r}}{d x^{r}} K(x)$$ for $r = 0, 1, 2, \dots$

The following assumptions are made on the density $f^{(r)}(x)$, the bandwidth $h$, and the kernel $K(x)$:

1. The $(r+2)$'th derivative $f^{(r+2)}(x)$ is continuous, square integrable and ultimately monotone.
2. $\lim_{n \to \infty} h = 0$ and $\lim_{n \to \infty}n h^{2r+1} = \infty$, i.e., as the number of samples $n$ increases, $h$ approaches zero at a rate slower than $n^{-1/(2r+1)}$.
3. $K(x) \geq 0$ and $\int_{\mathbb{R}} K(x) dx = 1$. The kernel function is assumed to be symmetric about the origin, i.e., $\int_{\mathbb{R}} xK^{(r)}(x) dx = 0$ for even $r$, and to have a finite second moment, i.e., $\mu_{2}(K)=\int_{\mathbb{R}}x^{2} K(x) dx < \infty$.

Under these assumptions, the MSE (Mean Squared Error), which combines the variance and the squared bias of the estimator, is: $$MSE\left(\hat{f}^{(r)}_{h}(x),f^{(r)}(x)\right)=\frac{f(x)R\left(K^{(r)}\right)}{nh^{2r+1}}+\frac{1}{4}h^{4}\mu_{2}^{2}(K) \left(f^{(r+2)}(x)\right)^{2}+o\left(h^{4}+\frac{1}{nh^{2r+1}}\right)$$

The MISE (Mean Integrated Squared Error) can be written as: $$MISE\left(\hat{f}^{(r)}_{h}(x),f^{(r)}(x)\right)=AMISE\left(\hat{f}^{(r)}_{h}(x),f^{(r)}(x)\right)+o\left(h^{4}+\frac{1}{nh^{2r+1}}\right)$$ where $$AMISE\left(\hat{f}^{(r)}_{h}(x),f^{(r)}(x)\right)=\frac{1}{nh^{2r+1}}R\left(K^{(r)}\right)+\frac{1}{4}h^{4}\mu_{2}^{2}(K)R\left(f^{(r+2)}\right)$$ with $R\left(f^{(r)}\right) = \int_{\mathbb{R}} \left(f^{(r)}(x)\right)^{2}dx.$

The performance of the kernel estimator is measured by the MISE or the AMISE (Asymptotic MISE). If the bandwidth h is missing from dkde, then the default bandwidth is h.ucv(x,deriv.order,kernel) (unbiased cross-validation, see h.ucv). For more details see the references.
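As a rough illustration of the estimator formula given in Details, the following base-R sketch evaluates it directly for the Gaussian kernel, whose first three derivatives have simple closed forms. The helper names gauss_kernel_deriv and kdde_sketch are hypothetical and introduced only for this example; this is not the package's implementation, and the bandwidth is fixed by hand rather than selected by h.ucv.

```r
# K^(r)(u) for the Gaussian kernel, r = 0..3 (closed forms; hypothetical helper)
gauss_kernel_deriv <- function(u, r = 0) {
  phi <- dnorm(u)
  switch(as.character(r),
         "0" = phi,
         "1" = -u * phi,
         "2" = (u^2 - 1) * phi,
         "3" = (3 * u - u^3) * phi,
         stop("this sketch only covers r = 0, 1, 2, 3"))
}

# Direct evaluation of (1 / (n h^(r+1))) * sum_i K^(r)((t - X_i) / h)
# at each point t of eval.points (hypothetical helper)
kdde_sketch <- function(x, eval.points, h, r = 0) {
  n <- length(x)
  vapply(eval.points,
         function(t) sum(gauss_kernel_deriv((t - x) / h, r)) / (n * h^(r + 1)),
         numeric(1))
}

set.seed(1)
x <- rnorm(200)
h <- 0.4                                   # hand-picked bandwidth (assumption)
grid <- seq(min(x) - 4 * h, max(x) + 4 * h, length.out = 401)

f0 <- kdde_sketch(x, grid, h, r = 0)       # density estimate
f1 <- kdde_sketch(x, grid, h, r = 1)       # first-derivative estimate
```

The direct double loop costs on the order of length(x) times length(eval.points) kernel evaluations, which is fine for a small illustration; nothing here says how dkde itself organises the computation.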

References

Alekseev, V. G. (1972). Estimation of a probability density function and its derivatives. Mathematical Notes of the Academy of Sciences of the USSR, 12 (5), 808--811.
Bhattacharya, P. K. (1967). Estimation of a probability density function and its derivatives. Sankhya: The Indian Journal of Statistics, Series A, 29, 373--382.
Bowman, A. W. and Azzalini, A. (1997). Applied Smoothing Techniques for Data Analysis: the Kernel Approach with S-Plus Illustrations. Oxford University Press, Oxford.
Härdle, W. (1991). Smoothing Techniques, With Implementation in S. Springer-Verlag, New York.
Schuster, E. F. (1969). Estimation of a probability density function and its derivatives. The Annals of Mathematical Statistics, 40 (4), 1187--1195.
Scott, D. W. (1992). Multivariate Density Estimation. Theory, Practice and Visualization. Wiley, New York.
Silverman, B. W. (1986). Density Estimation for Statistics and Data Analysis. Chapman & Hall/CRC, London.
Simonoff, J. S. (1996). Smoothing Methods in Statistics. Springer-Verlag, New York.
Singh, R. S. (1987). MISE of kernel estimates of a density and its derivatives. Statistics and Probability Letters, 5, 153--159.
Stoker, T. M. (1993). Smoothing bias in density derivative estimation. Journal of the American Statistical Association, 88, 855--863.
Tsybakov, A. B. (2009). Introduction to Nonparametric Estimation. Springer-Verlag, New York.
Venables, W. N. and Ripley, B. D. (2002). Modern Applied Statistics with S. Springer, New York.
Wand, M. P. and Jones, M. C. (1995). Kernel Smoothing. Chapman and Hall, London.

Examples

## EXAMPLE 1:  Simple example of a Gaussian density derivative

x <- rnorm(100)
dkde(x,deriv.order=0)  ## KDE of f
dkde(x,deriv.order=1)  ## KDDE of d/dx f
dkde(x,deriv.order=2)  ## KDDE of d^2/dx^2 f
dkde(x,deriv.order=3)  ## KDDE of d^3/dx^3 f
dev.new()
par(mfrow=c(2,2))
plot(dkde(x,deriv.order=0))
plot(dkde(x,deriv.order=1))
plot(dkde(x,deriv.order=2))
plot(dkde(x,deriv.order=3))

## EXAMPLE 2: Bimodal Gaussian density derivative
## show the kernels in the dkde parametrization
## ('bimodal' is assumed to be a data set supplied with the package)

## true density fx and its first derivative fx1
fx  <- function(x) 0.5 * dnorm(x,-1.5,0.5) + 0.5 * dnorm(x,1.5,0.5)
fx1 <- function(x) 0.5 *(-4*x-6)* dnorm(x,-1.5,0.5) + 0.5 *(-4*x+6) * 
                   dnorm(x,1.5,0.5)

## 'h = 0.3' ; 'Derivative order = 0'

kernels <- eval(formals(dkde.default)$kernel)
dev.new()
plot(dkde(bimodal,h=0.3),sub=paste("Derivative order = 0",";",
     "Bandwidth =0.3 "),ylim=c(0,0.5), main = "Bimodal Gaussian Density")
for(i in 2:length(kernels))
   lines(dkde(bimodal, h = 0.3, kernel =  kernels[i]), col = i)
curve(fx,add=TRUE,lty=8)
legend("topright", legend = c(TRUE,kernels), col = c("black",seq(kernels)),
          lty = c(8,rep(1,length(kernels))),cex=0.7, inset = .015)
	   
## 'h = 0.6' ; 'Derivative order = 1'

## drop the third kernel ("uniform"), presumably because it is not differentiable
kernels <- eval(formals(dkde.default)$kernel)[-3]
dev.new()
plot(dkde(bimodal,deriv.order=1,h=0.6),main = "Bimodal Gaussian Density Derivative",
     sub=paste("Derivative order = 1",";","Bandwidth =0.6"),ylim=c(-0.6,0.6))
for(i in 2:length(kernels))
   lines(dkde(bimodal,deriv.order=1, h = 0.6, kernel =  kernels[i]), col = i)
curve(fx1,add=TRUE,lty=8)
legend("topright", legend = c(TRUE,kernels), col = c("black",seq(kernels)),
          lty = c(8,rep(1,length(kernels))),cex=0.7, inset = .015)
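The analytic derivative fx1 used in Example 2 follows from d/dx dnorm(x, m, s) = -(x - m)/s^2 * dnorm(x, m, s): with s = 0.5 the factor is -4(x - m), giving -4x - 6 for m = -1.5 and -4x + 6 for m = 1.5. The short check below, a sketch in base R that does not call dkde, compares fx1 with a central finite difference of fx; the step eps and the evaluation grid are arbitrary choices made for this illustration.

```r
# True bimodal density and its analytic first derivative (as in Example 2)
fx  <- function(x) 0.5 * dnorm(x, -1.5, 0.5) + 0.5 * dnorm(x, 1.5, 0.5)
fx1 <- function(x) 0.5 * (-4 * x - 6) * dnorm(x, -1.5, 0.5) +
                   0.5 * (-4 * x + 6) * dnorm(x,  1.5, 0.5)

# Central finite-difference approximation of the derivative of fx
eps  <- 1e-5                       # arbitrary small step
grid <- seq(-4, 4, by = 0.05)      # arbitrary evaluation grid
fd   <- (fx(grid + eps) - fx(grid - eps)) / (2 * eps)

# Largest discrepancy between the numerical and the analytic derivative
max_err <- max(abs(fd - fx1(grid)))
```

A small max_err indicates that the dashed reference curve drawn by curve(fx1, ...) is indeed the derivative of the density drawn by curve(fx, ...).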
