TDA (version 1.0)

bootstrapBand: Bootstrap Confidence Band

Description

bootstrapBand computes a uniform symmetric confidence band around a function of the data X, evaluated on a Grid, using the bootstrap algorithm. See Details and References.

Usage

bootstrapBand(X, FUN, Grid, B = 30, alpha = 0.05, parallel = FALSE, 
              printStatus=FALSE, ...)

Arguments

X
an $n$ by $d$ matrix of coordinates of points used by the function FUN, where $n$ is the number of points and $d$ is the dimension.
FUN
a function that takes an $n$ by $d$ matrix of coordinates X and an $m$ by $d$ matrix of coordinates Grid, and returns a numeric vector of length $m$. For an example, see distFct.
Grid
an $m$ by $d$ matrix of coordinates, where $m$ is the number of points in the grid.
B
the number of bootstrap iterations.
alpha
bootstrapBand returns a (1-alpha) confidence band.
parallel
logical: if TRUE the bootstrap iterations are parallelized, using the library parallel.
printStatus
if TRUE a progress bar is printed. Default is FALSE.
...
additional parameters for the function FUN.

Value

Returns a list with the following elements:

  • width: a number, the (1-alpha) quantile of the values computed by the bootstrap algorithm. It is half the width of the uniform confidence band; that is, width is the distance of both the upper and the lower limit of the band from the function evaluated using the original dataset X.
  • fun: a numeric vector of length $m$, storing the values of the input function FUN, evaluated on the Grid using the original data X.
  • band: an $m$ by 2 matrix that stores the lower limit of the confidence band (first column) and the upper limit of the confidence band (second column), evaluated over the Grid.
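
Read together, the three elements are related as follows. This is a restatement of the descriptions above, not taken from the package source; the index $i$ over grid points is introduced here only for illustration:

$$
\mathrm{band}[i, 1] = \mathrm{fun}[i] - \mathrm{width}, \qquad
\mathrm{band}[i, 2] = \mathrm{fun}[i] + \mathrm{width}, \qquad i = 1, \dots, m.
$$

The same width is used at every grid point, which is what makes the band uniform and symmetric.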

Details

First, the input function FUN is evaluated on the Grid using the original data X. Then, B times, the bootstrap algorithm draws n points from X with replacement, evaluates FUN on the Grid using the resampled points, and computes the $\ell_\infty$ distance (the maximum absolute difference over the Grid) between the original function and the bootstrapped one. The result is a sequence of B values. The (1-alpha) confidence band is constructed by taking the (1-alpha) quantile of these values as the half-width of the band around the original function.
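
The steps above can be sketched in base R. This is a minimal illustration of the description, not the package's implementation: the names bootstrapBandSketch and gaussKde are hypothetical, gaussKde is a hand-written one-dimensional Gaussian kernel estimator standing in for FUN, and details such as parallel execution and the progress bar are omitted.

```r
# Hypothetical stand-in for FUN: a 1-D Gaussian kernel density estimate
# evaluated at each grid point. This is NOT the TDA kde function.
gaussKde <- function(X, Grid, h) {
  sapply(Grid, function(g) mean(dnorm((g - X) / h)) / h)
}

# Sketch of the bootstrap band as described in Details (assumed, simplified).
bootstrapBandSketch <- function(X, FUN, Grid, B = 30, alpha = 0.05, ...) {
  X <- as.matrix(X)        # n by d
  Grid <- as.matrix(Grid)  # m by d
  n <- nrow(X)

  # Step 1: evaluate FUN on the Grid using the original data.
  fun <- FUN(X, Grid, ...)

  # Step 2: B times, resample n rows with replacement and record the
  # l-infinity distance between the bootstrapped and original function.
  supDist <- replicate(B, {
    idx <- sample.int(n, n, replace = TRUE)
    max(abs(FUN(X[idx, , drop = FALSE], Grid, ...) - fun))
  })

  # Step 3: the (1 - alpha) quantile of those B values is the half-width.
  width <- quantile(supDist, 1 - alpha, names = FALSE)

  list(width = width, fun = fun, band = cbind(fun - width, fun + width))
}

# Usage on a small simulated sample.
set.seed(1)
X <- c(rnorm(100), rnorm(100, mean = 3, sd = 1.2))
Grid <- seq(-3, 6, by = 0.1)
res <- bootstrapBandSketch(X, gaussKde, Grid, B = 50, alpha = 0.05, h = 0.3)
```

Because a single quantile is applied across the whole Grid, the resulting band has constant half-width res$width at every grid point.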

References

Larry Wasserman (2004), "All of statistics: a concise course in statistical inference", Springer.

Brittany T. Fasy, Fabrizio Lecci, Alessandro Rinaldo, Larry Wasserman, Sivaraman Balakrishnan, and Aarti Singh. (2013), "Statistical Inference For Persistent Homology: Confidence Sets for Persistence Diagrams", (arXiv:1303.7117). To appear, Annals of Statistics.

Chazal F, Fasy BT, Lecci F, Michel B, Rinaldo A, Wasserman L (2014). "Robust Topological Inference: Distance-To-a-Measure and Kernel Distance." Technical Report.

See Also

kde, dtm

Examples

library(TDA)

# Generate data from a mixture of 2 normals.
n = 2000
X = c(rnorm(n/2), rnorm(n/2, mean=3, sd=1.2))

# Construct a grid of points over which we evaluate the function
by=0.02
Grid=seq(-3, 6, by=by)

## bandwidth for kernel density estimator
h=0.3  
## Bootstrap confidence band
band= bootstrapBand(X, kde, Grid, B=100, parallel=FALSE, alpha=0.05, h=h)

plot(Grid,band$fun, type="l", lwd=2, ylim=c(0, max(band$band)), 
     main="kde with 0.95 confidence band")
lines(Grid, pmax(band$band[,1],0), col=2, lwd=2)
lines(Grid, band$band[,2], col=2, lwd=2)
