kerdiest (version 1.2)

kde: Kernel estimator of the distribution function

Description

Computes the kernel estimator of the distribution function, either at a single value or on a grid. Four possibilities for the kernel function are implemented, and the bandwidth parameter can be calculated directly by the plug-in method of Polansky and Baker (2000).
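As a rough illustration of what such an estimator computes: with the Normal kernel, the standard kernel distribution estimate at a point y is the average of pnorm((y - X_i) / h) over the sample, where h is the bandwidth. The base-R sketch below follows that textbook formula. It is not the package's implementation; the helper name `kde_sketch` and the bandwidth value 0.3 are assumptions made only for this example.

```r
# Hypothetical helper (not part of kerdiest): kernel distribution estimate
# with a Normal kernel, evaluated at every point of y.
kde_sketch <- function(vec_data, y, bw) {
  # At each evaluation point, average the Normal CDF of the scaled distances
  # between that point and the observations.
  vapply(y, function(pt) mean(pnorm((pt - vec_data) / bw)), numeric(1))
}

set.seed(1)
x <- rnorm(100)

# 100 equidistant points from the sample minimum to the sample maximum
grid <- seq(min(x), max(x), length.out = 100)

# bw = 0.3 is an arbitrary illustrative bandwidth, not a recommended choice
F_hat <- kde_sketch(x, grid, bw = 0.3)

# Compare the first few estimates with the true N(0, 1) distribution function
head(cbind(grid = grid, F_hat = F_hat, F_true = pnorm(grid)))
```

Because each term is a distribution function, the resulting estimate is itself smooth, nondecreasing, and bounded between 0 and 1, unlike the step-shaped empirical distribution function.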

Usage

kde(type_kernel = "n", vec_data, y = NULL, bw = PBbw(type_kernel = "n", vec_data, 2))

Arguments

type_kernel
The kernel function. You can use four types: "e" Epanechnikov, "n" Normal, "b" Biweight and "t" Triweight. The Normal kernel is used by default.
vec_data
The data sample.
y
The single value or the grid vector where the distribution function is estimated. By default, a grid of 100 equidistant points from the minimum to the maximum of the data sample is selected.
bw
The bandwidth used. If it is not provided, the plug-in bandwidth of Polansky and Baker (2000) is computed.

Value

Estimated_values
Vector containing the estimated distribution function at the grid values.
grid
The grid used.
bw
Value of the bandwidth.
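A short sketch of how the returned components might be accessed, assuming the kerdiest package is installed and that `kde` returns a list with the three elements named above. The object name `fit` and the evaluation point 0.5 are illustrative choices only.

```r
library(kerdiest)

set.seed(1)
x <- rnorm(100)

# Estimate on the default grid, with the default (Polansky and Baker) bandwidth
fit <- kde(type_kernel = "n", vec_data = x)

fit$bw                       # bandwidth that was used
head(fit$grid)               # first grid points
head(fit$Estimated_values)   # estimated distribution function at those points

# Estimate at a single value instead of a grid, reusing the same bandwidth
fit_single <- kde(type_kernel = "n", vec_data = x, y = 0.5, bw = fit$bw)
fit_single$Estimated_values
```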

References

Reiss, R.D. (1981) Nonparametric estimation of smooth distribution functions, Scandinavian Journal of Statistics 8, pp. 116-119.

Simonoff, J. (1996) Smoothing Methods in Statistics, Springer, New York.

Polansky, A.M. and Baker, E.R. (2000) Multistage plug-in bandwidth selection for kernel distribution function estimates, Journal of Statistical Computation and Simulation 65, pp. 63-80.

Quintela-del-Rio, A. and Estevez-Perez, G. (2012) Nonparametric Kernel Distribution Function Estimation with kerdiest: An R Package for Bandwidth Choice and Applications, Journal of Statistical Software 50(8), pp. 1-21. URL http://www.jstatsoft.org/v50/i08/.

Examples

# Comparison of three bandwidth selection methods
library(kerdiest)

x <- rnorm(100)
# Bandwidths by cross-validation, the plug-in of Altman and Leger, and the
# plug-in of Polansky and Baker, each computed with a normal kernel and the
# default parameter settings
h_CV <- CVbw(vec_data = x)$bw
# plug-in of Altman and Leger
h_AL <- ALbw(vec_data = x)
# plug-in of Polansky and Baker
h_PB <- PBbw(vec_data = x)
## Not run: 
print(h_CV); print(h_AL); print(h_PB)
# Plot the three estimates together with the true distribution
F_CV <- kde(vec_data = x, bw = h_CV)
F_AL <- kde(vec_data = x, bw = h_AL)
F_PB <- kde(vec_data = x, bw = h_PB)
y <- F_CV$grid
Ft <- pnorm(y)
require(graphics)
plot(y, Ft, ylab = "Distribution", xlab = "data", type = "l", lty = 1)
lines(y, F_CV$Estimated_values, type = "l", lty = 2)
lines(y, F_AL$Estimated_values, type = "l", lty = 3)
lines(y, F_PB$Estimated_values, type = "l", lty = 4)

legend(1, 0.4, c("real", "F_CV", "F_AL", "F_PB"), lty = 1:4)
## End(Not run)
