kde: Kernel estimator of the distribution function

Description

Computes the kernel estimator of the distribution function at a single value or over a grid of values. Four kernel functions are implemented, and the bandwidth can be computed directly with the plug-in method of Polansky and Baker (2000).
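
To make the description concrete: a kernel distribution function estimate at a point y is the sample average of K((y - X_i)/h), where K is the integral of the kernel and h is the bandwidth. The base-R sketch below uses the normal kernel, for which K is pnorm. The helper name kde_sketch and the bandwidth 0.3 are assumptions made for illustration only; this is not the package's implementation, and it does no bandwidth selection.

```r
# Minimal sketch of a kernel distribution function estimator (normal kernel).
# kde_sketch is a hypothetical helper, not part of kerdiest.
kde_sketch <- function(vec_data, y, bw) {
  # For each evaluation point, average the integrated kernel over the sample
  sapply(y, function(yy) mean(pnorm((yy - vec_data) / bw)))
}

set.seed(1)
x <- rnorm(100)
grid <- seq(min(x), max(x), length.out = 100)
F_hat <- kde_sketch(x, grid, bw = 0.3)  # bw = 0.3 is an arbitrary choice
```

Because each term is a smooth, increasing function of y, the resulting estimate is a smooth, nondecreasing curve, unlike the step-shaped empirical distribution function.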

Usage

kde(type_kernel = "n", vec_data, y = NULL, bw = PBbw(type_kernel = "n", 
vec_data, 2))

Arguments

type_kernel

The kernel function. Four types are available: "e" (Epanechnikov), "n" (Normal), "b" (Biweight) and "t" (Triweight). The Normal kernel is the default.

vec_data

The data sample.

y

The single value or the grid vector where the distribution function is estimated. By default, a grid of 100 equidistant points from the minimum to the maximum of the data sample is used.

bw

The bandwidth used. If it is not provided, the plug-in bandwidth of Polansky and Baker (2000) is computed.
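
The kernel choices and the default grid described in the arguments can be sketched in base R as follows. The integrated kernels are the standard closed forms for the Epanechnikov, biweight and triweight kernels on [-1, 1]. The helper name kernel_cdf and the bandwidth h = 0.5 are assumptions for illustration, not part of kerdiest, and the package's internal code may differ.

```r
# Integrated kernels (kernel CDFs) for the four kernel types named above.
# kernel_cdf is a hypothetical helper, not a kerdiest function.
kernel_cdf <- function(u, type_kernel = "n") {
  if (type_kernel == "n") return(pnorm(u))
  v <- pmin(pmax(u, -1), 1)  # compact-support kernels: clamp to [-1, 1]
  switch(type_kernel,
    e = 0.5 + 0.75 * (v - v^3 / 3),
    b = 0.5 + (15 / 16) * (v - 2 * v^3 / 3 + v^5 / 5),
    t = 0.5 + (35 / 32) * (v - v^3 + 3 * v^5 / 5 - v^7 / 7),
    stop("unknown type_kernel")
  )
}

set.seed(1)
x <- rnorm(100)
# Default grid as described: 100 equidistant points from min to max
y <- seq(min(x), max(x), length.out = 100)
h <- 0.5  # arbitrary bandwidth, chosen only for this illustration
F_e <- sapply(y, function(yy) mean(kernel_cdf((yy - x) / h, "e")))
```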

Value

Estimated_values: Vector containing the estimated distribution function at the grid values.
grid: The grid used.
bw: The value of the bandwidth.

References

Reiss, R.D. (1981) Nonparametric estimation of smooth distribution functions, Scandinavian Journal of Statistics 8, pp. 116-119.

Simonoff, J. (1996) Smoothing Methods in Statistics, Springer, New York.

Polansky, A.M. and Baker, E.R. (2000) Multistage plug-in bandwidth selection for kernel distribution function estimates, Journal of Statistical Computation and Simulation 65, pp. 63-80.

Quintela-del-Rio, A. and Estevez-Perez, G. (2012) Nonparametric Kernel Distribution Function Estimation with kerdiest: An R Package for Bandwidth Choice and Applications, Journal of Statistical Software 50(8), pp. 1-21. URL http://www.jstatsoft.org/v50/i08/.

Examples

# Comparison of three bandwidth selection methods

library(kerdiest)

x <- rnorm(100)
# The bandwidths by cross-validation, plug-in of Altman and Leger
# and plug-in of Polansky and Baker are calculated, using a normal kernel and a 
# standard setting of parameters, in each case
h_CV<-CVbw(vec_data=x)$bw
# plug-in of Altman and Leger
h_AL<- ALbw(vec_data=x)
# plug-in of Polansky and Baker
h_PB<- PBbw(vec_data=x)
## Not run:
# print(h_CV); print(h_AL); print(h_PB)
# # plot of the three estimates together with the real distribution
# F_CV<-kde(vec_data=x, bw= h_CV)
# F_AL<-kde(vec_data=x, bw= h_AL)
# F_PB<-kde(vec_data=x, bw= h_PB)
# y<-F_CV$grid
# Ft<-pnorm(y)
# require(graphics)
# plot(y,Ft, ylab="Distribution", xlab="data", type="l", lty=1)
# lines(y,F_CV$Estimated_values, type="l",lty=2)
# lines(y,F_AL$Estimated_values, type="l",lty=3)
# lines(y,F_PB$Estimated_values, type="l",lty=4)
# 
# legend(1,0.4,c("real","F_CV","F_AL","F_PB"),lty=1:4)
## End(Not run)