ks (version 1.8.13)

kcde: Kernel cumulative distribution/survival function estimate for multivariate data

Description

Kernel cumulative distribution/survival function estimate for 1- to 3-dimensional data.

Usage

kcde(x, H, h, gridsize, gridtype, xmin, xmax, supp=3.7, eval.points,
  binned=FALSE, bgridsize, positive=FALSE, adj.positive, w, verbose=FALSE,
  tail.flag="lower.tail")
Hpi.kcde(x, nstage=2, pilot="dunconstr", Hstart, binned=FALSE, bgridsize,
  amise=FALSE, verbose=FALSE, optim.fun="nlm")
hpi.kcde(x, nstage=2, binned=TRUE)

Arguments

x
matrix of data values
H,h
bandwidth matrix/scalar bandwidth. If these are missing, then Hpi.kcde or hpi.kcde is called by default.
gridsize
vector of number of grid points
gridtype
not yet implemented
xmin,xmax
vector of minimum/maximum values for grid
supp
effective support for standard normal
eval.points
points at which estimate is evaluated
binned
flag for binned estimation. Default is FALSE.
bgridsize
vector of binning grid sizes
positive
flag if 1-d data are positive. Default is FALSE.
adj.positive
adjustment applied to positive 1-d data
w
not yet implemented
verbose
flag to print out progress information. Default is FALSE.
tail.flag
"lower.tail" = cumulative distribution, "upper.tail" = survival function
nstage
number of stages in the plug-in bandwidth selector (1 or 2)
pilot
"dscalar" = single pilot bandwidth, "dunconstr" = single unconstrained pilot bandwidth
Hstart
initial bandwidth matrix, used in numerical optimisation
amise
flag to return the minimal scaled PI value
optim.fun
optimiser function: one of nlm or optim

Value

A kernel cumulative distribution estimate is an object of class kcde which is a list with fields:

  • x: data points, same as input
  • eval.points: points at which the estimate is evaluated
  • estimate: cumulative distribution/survival function estimate at eval.points
  • h: scalar bandwidth (1-d only)
  • H: bandwidth matrix
  • gridtype: "linear"
  • gridded: flag for estimation on a grid
  • binned: flag for binned estimation
  • names: variable names
  • w: weights
  • tail: "lower.tail" = cumulative distribution, "upper.tail" = survival function
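
The fields listed above can be inspected directly on a fitted object. The following is a minimal sketch, assuming the ks package is installed; it reuses the call from the Examples section and only prints the class, the field names, and the bandwidth matrix H, without assuming particular values.

```r
# Sketch only: the body runs when the ks package is available
if (requireNamespace("ks", quietly = TRUE)) {
  # Same call as in Examples; H is missing, so a default selector is used
  Fhat <- ks::kcde(iris[, 1:2])

  print(class(Fhat))   # class of the returned object
  print(names(Fhat))   # names of the list fields
  print(Fhat$H)        # bandwidth matrix used for the estimate
}
```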

Details

If tail.flag="lower.tail", then the cumulative distribution function $\mathrm{Pr}(\mathbf{X}\leq\mathbf{x})$ is estimated; if tail.flag="upper.tail", the survival function $\mathrm{Pr}(\mathbf{X}>\mathbf{x})$ is estimated instead. For d>1, $\mathrm{Pr}(\mathbf{X}\leq\mathbf{x}) \neq 1 - \mathrm{Pr}(\mathbf{X}>\mathbf{x})$ in general, because the inequalities hold componentwise. If the bandwidth H is missing from kcde, then the default bandwidth is the binned 2-stage plug-in selector Hpi.kcde(x, nstage=2, binned=TRUE). Likewise, a missing h defaults to hpi.kcde. These bandwidth selectors are optimal for cumulative distribution/survival functions; see Duong (2013).
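
The relationship between the two tails can be checked with a small base R sketch. This is an illustration, not the ks implementation: kcde_1d is a hypothetical helper that averages normal CDFs centred at the data points, the bandwidth h = 0.3 is hand-picked rather than chosen by hpi.kcde, and the 2-d part uses plain empirical proportions instead of a kernel estimate.

```r
# Hypothetical helper (not part of ks): 1-d normal-kernel estimate of
# Pr(X <= x) for "lower.tail", or Pr(X > x) for "upper.tail"
kcde_1d <- function(x, data, h, tail.flag = "lower.tail") {
  lower <- tail.flag == "lower.tail"
  # Average the normal CDF (or its upper tail) over all data points
  sapply(x, function(x0) mean(pnorm((x0 - data) / h, lower.tail = lower)))
}

x1  <- iris$Sepal.Length
pts <- c(5, 6, 7)                      # evaluation points
F1  <- kcde_1d(pts, x1, h = 0.3)       # h = 0.3 is an arbitrary choice
S1  <- kcde_1d(pts, x1, h = 0.3, tail.flag = "upper.tail")
sum_1d <- F1 + S1                      # the two tails are complementary in 1-d

# In 2-d both inequalities are componentwise, so the two orthants
# do not cover the whole plane
X  <- as.matrix(iris[, 1:2])
x0 <- colMeans(X)
p_lower <- mean(X[, 1] <= x0[1] & X[, 2] <= x0[2])  # empirical Pr(X <= x)
p_upper <- mean(X[, 1] >  x0[1] & X[, 2] >  x0[2])  # empirical Pr(X > x)
sum_2d  <- p_lower + p_upper           # falls short of 1

print(sum_1d)
print(sum_2d)
```

In one dimension the two estimates sum to 1 at every evaluation point. In two dimensions the remaining probability sits in the mixed regions where one coordinate lies below x and the other above it, which is why the survival function is estimated separately rather than derived from the distribution function.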

Binning/exact estimation and positive 1-d data behaviour is the same as for kde. No pre-scaling/pre-sphering is used since the bandwidth selectors Hpi.kcde are not invariant to translation/dilation.

References

Duong, T. (2013) Non-parametric kernel estimation of multivariate cumulative distribution functions and receiver operating characteristic curves. Submitted.

See Also

kde, plot.kcde

Examples

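library(ks)   # provides kcde, Hpi.kcde and hpi.kcde
data(iris)
## Kernel CDF estimate for the first two iris variables;
## H is missing, so the plug-in selector Hpi.kcde is called by default
Fhat <- kcde(iris[,1:2])

## See other examples in ?plot.kcde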
