TDA (version 1.0)

bottleneckInterval: Bootstrapped Confidence Set for a Persistence Diagram, using the Bottleneck Distance.

Description

bottleneckInterval computes a (1-alpha) confidence set for the Persistence Diagram of a filtration of sublevel sets (or superlevel sets) of a function evaluated over a grid of points in dimension $d = 1$, 2 or 3. The function returns the (1-alpha) quantile of B bottleneck distances, computed in B iterations of the bootstrap algorithm. The method is discussed in the first reference.

Usage

bottleneckInterval(X, FUN, Xlim, Ylim = NA, Zlim = NA, by=(Xlim[2]-Xlim[1])/20, 
         sublevel = TRUE, B=30, alpha=0.05, dimension=1, printStatus = FALSE, ...)

Arguments

X
an $n$ by $d$ matrix of coordinates, used by the function FUN, where $n$ is the number of points stored in X and $d$ is the dimension (1, 2 or 3).
FUN
a function whose inputs are 1) an $n$ by $d$ matrix of coordinates X, 2) an $m$ by $d$ matrix of coordinates Grid, 3) an optional smoothing parameter, and returns a numeric vector of length $m$. For example see
Xlim
a numeric vector of length 2, specifying the range of the first dimension of the grid, over which the function FUN is evaluated.
Ylim
a numeric vector of length 2, specifying the range of the second dimension of the grid, over which the function FUN is evaluated. NA for a 1 dimensional grid.
Zlim
a numeric vector of length 2, specifying the range of the third dimension of the grid, over which the function FUN is evaluated. NA for a 1 dimensional or 2 dimensional grid.
by
a number specifying the spacing between adjacent grid points in each dimension.
sublevel
a logical variable indicating if the Persistence Diagram should be computed for sublevel sets (TRUE) or superlevel sets (FALSE) of the function. Default is TRUE.
B
the number of bootstrap iterations.
alpha
a number between 0 and 1. bottleneckInterval returns the (1-alpha) quantile of the bootstrapped bottleneck distances.
dimension
an integer specifying the dimension of the features used to compute the bottleneck distance. 0 for connected components, 1 for loops, 2 for voids.
printStatus
if TRUE a progress bar is printed. Default is FALSE.
...
additional parameters for the function FUN.
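
The FUN and grid arguments described above can be illustrated with a short base R sketch. The function name gauss_score and the expand.grid layout are assumptions made for illustration only: they are not part of TDA, and the package may build its grid differently. The sketch only shows the interface FUN is expected to follow, namely an $n$ by $d$ matrix X, an $m$ by $d$ matrix Grid, an optional smoothing parameter, and a numeric vector of length $m$ as output.

```r
# Hypothetical FUN (not part of TDA): an unnormalised Gaussian kernel score.
# X is an n by d matrix, Grid is an m by d matrix, h is a smoothing parameter.
gauss_score <- function(X, Grid, h = 0.3) {
  X <- as.matrix(X)
  Grid <- as.matrix(Grid)
  # For each grid point, average the Gaussian kernel over all sample points.
  apply(Grid, 1, function(g) {
    sq_dist <- rowSums(sweep(X, 2, g)^2)
    mean(exp(-sq_dist / (2 * h^2)))
  })
}

# Assumed grid layout: a regular grid with spacing `by` in each dimension.
Xlim <- c(-1, 1)
Ylim <- c(-1, 1)
by <- 0.5
Grid <- as.matrix(expand.grid(seq(Xlim[1], Xlim[2], by = by),
                              seq(Ylim[1], Ylim[2], by = by)))

# Two sample points in the plane, chosen only to exercise the function.
X <- matrix(c(0, 0, 0.5, 0.5), ncol = 2, byrow = TRUE)
vals <- gauss_score(X, Grid, h = 0.3)  # one value per grid point
```

A function with this shape could be passed as FUN, with h supplied through the ... argument.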

Value

  • Returns the (1-alpha) quantile of the values computed by the bootstrap algorithm. It corresponds to half the width of the confidence set for the persistence diagram.
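
One way to read the returned value, sketched below in base R, is to treat features whose persistence exceeds twice that value as lying outside the confidence band. This mirrors the band = 2*cc call used in the Examples, but the filtering rule, the toy diagram, and the value of cc are assumptions made for illustration. The only thing taken from the package is the layout of a diagram as a matrix with columns for dimension, birth and death.

```r
# Toy diagram with made-up values: columns are dimension, Birth, Death.
Diag <- matrix(c(0, 0.00, 1.00,
                 1, 0.20, 0.90,
                 1, 0.40, 0.45),
               ncol = 3, byrow = TRUE,
               dimnames = list(NULL, c("dimension", "Birth", "Death")))

# Made-up number standing in for the output of bottleneckInterval.
cc <- 0.1

# Persistence (lifetime) of each feature.
persistence <- abs(Diag[, "Death"] - Diag[, "Birth"])

# Keep features whose persistence exceeds the band width 2 * cc.
significant <- Diag[persistence > 2 * cc, , drop = FALSE]
```

Because the quantile is computed for features of the chosen dimension only, in practice the filter would normally be restricted to rows with that dimension.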

Details

bottleneckInterval uses gridDiag to compute the persistence diagram of the input function using the entire sample. The bootstrap algorithm then repeats the following step B times: it computes the bottleneck distance between the original persistence diagram and the one computed using a subsample. Finally, the (1-alpha) quantile of these B values is returned.
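
The procedure above can be sketched generically in base R. The helper bootstrap_quantile and its arguments diag_fun and dist_fun are hypothetical: they stand in for gridDiag and bottleneck and are not part of TDA. Resampling rows with replacement is also an assumption (the standard bootstrap), not something confirmed by this page. To keep the sketch runnable without the package, the usage lines substitute column means for the diagram and a maximum absolute difference for the bottleneck distance.

```r
# Hypothetical helper (not part of TDA) showing the shape of the bootstrap step.
# diag_fun: maps a data matrix to a summary (stand-in for gridDiag).
# dist_fun: distance between two summaries (stand-in for bottleneck).
bootstrap_quantile <- function(X, diag_fun, dist_fun, B = 30, alpha = 0.05) {
  X <- as.matrix(X)
  n <- nrow(X)
  full <- diag_fun(X)  # summary computed from the entire sample
  dists <- vapply(seq_len(B), function(b) {
    # Assumption: resample n rows with replacement.
    idx <- sample.int(n, n, replace = TRUE)
    dist_fun(full, diag_fun(X[idx, , drop = FALSE]))
  }, numeric(1))
  # (1 - alpha) quantile of the B distances.
  unname(quantile(dists, 1 - alpha))
}

# Toy usage with stand-in functions, so the sketch runs in base R.
set.seed(1)
X <- matrix(rnorm(200), ncol = 2)
cc <- bootstrap_quantile(X,
                         diag_fun = colMeans,
                         dist_fun = function(a, b) max(abs(a - b)),
                         B = 50, alpha = 0.05)
```

With the real functions, diag_fun would wrap a gridDiag call on the resampled points and dist_fun would wrap bottleneck for the chosen dimension.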

References

Chazal F, Fasy BT, Lecci F, Michel B, Rinaldo A, Wasserman L (2014). "Robust Topological Inference: Distance-To-a-Measure and Kernel Distance." Technical Report.

Wasserman L (2004). "All of Statistics: A Concise Course in Statistical Inference." Springer.

http://www.mrzv.org/software/dionysus/

See Also

bottleneck, bootstrapBand, distFct, kde, kernelDist, dtm, summary.diagram, plot.diagram

Examples

## confidence set for the Kernel Density Diagram

# load the package and generate the input data
library(TDA)
n = 400
XX = circleUnif(n)

## Ranges of the grid
Xlim=c(-1.8,1.8)
Ylim=c(-1.6,1.6)
by=0.05

h = .3  #bandwidth for the function kde

#Kernel Density Diagram of the superlevel sets
Diag=gridDiag(XX, kde, Xlim, Ylim, by=by, sublevel=FALSE, printStatus=TRUE, h=h) 

# confidence set
B=15         ## the number of bootstrap iterations should be higher! 
             ## this is just an example
alpha=0.05

cc=bottleneckInterval(XX, kde, Xlim, Ylim, by=by, sublevel=FALSE, B=B, alpha=alpha, 
   dimension=1, printStatus=TRUE, h=h)

plot(Diag, band=2*cc)