TDA (version 1.3)

bootstrapDiagram: Bootstrapped Confidence Set for a Persistence Diagram, using the Bottleneck Distance (or the Wasserstein distance).

Description

bootstrapDiagram computes a (1-alpha) confidence set for the Persistence Diagram of a filtration of sublevel sets (or superlevel sets) of a function evaluated over a grid of points. The function returns the (1-alpha) quantile of B bottleneck distances (or Wasserstein distances), computed in B iterations of the bootstrap algorithm. The method is discussed in the first reference.

Usage

bootstrapDiagram(X, FUN, lim, by, sublevel = TRUE, library="Dionysus",
                 B=30, alpha=0.05, distance="bottleneck", dimension=1,
                 p=1, printProgress = FALSE, ...)

Arguments

X
an $n$ by $d$ matrix of coordinates, used by the function FUN, where $n$ is the number of points stored in X and $d$ is the dimension of the space.
FUN
a function that takes as inputs 1) an $n$ by $d$ matrix of coordinates X, 2) an $m$ by $d$ matrix of coordinates Grid, and 3) an optional smoothing parameter, and that returns a numeric vector of length $m$. For example see
lim
a $2$ by $d$ matrix, where each column specifies the range of each dimension of the grid, over which the function FUN is evaluated.
by
either a number or a vector of length $d$ specifying the spacing between grid points in each dimension. If a single number is given, the same spacing is used in every dimension.
sublevel
a logical variable indicating if the Persistence Diagram should be computed for sublevel sets (TRUE) or superlevel sets (FALSE) of the function. Default is TRUE.
library
a string specifying the library used to compute the persistence diagram: either 'Dionysus' or 'PHAT'. Default is 'Dionysus'.
B
the number of bootstrap iterations. Default is 30.
alpha
a number between 0 and 1; bootstrapDiagram returns the (1-alpha) quantile. Default is 0.05.
distance
a string specifying the distance to be used for persistence diagrams: either 'bottleneck' or 'wasserstein'. Default is 'bottleneck'.
dimension
an integer specifying the dimension of the features used to compute the distance between diagrams (bottleneck or Wasserstein): 0 for connected components, 1 for loops, 2 for voids, and so on. Default is 1.
p
if distance=="wasserstein", then p is an integer specifying the power to be used in the computation of the Wasserstein distance. Default is 1.
printProgress
if TRUE a progress bar is printed. Default is FALSE.
...
additional parameters for the function FUN.
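
To make the roles of lim, by, and FUN concrete, here is a minimal base R sketch. It is an illustration under assumptions, not the package's internal code: the grid is built with seq and expand.grid, the sample is generated by hand instead of with circleUnif, and nearest_dist is a hypothetical FUN that returns the distance from each grid point to its nearest sample point.

```r
# Hand-made sample on the unit circle (stand-in for circleUnif).
set.seed(1)
theta <- runif(50, 0, 2 * pi)
X <- cbind(cos(theta), sin(theta))           # n = 50 by d = 2

# lim: 2 by d matrix, one column per dimension (row 1 = min, row 2 = max).
lim <- cbind(Xlim = c(-1.8, 1.8), Ylim = c(-1.6, 1.6))
by <- 0.05                                   # same spacing in every dimension

# Assumed grid construction: one sequence per dimension, then all combinations.
axes <- lapply(seq_len(ncol(lim)), function(j) seq(lim[1, j], lim[2, j], by = by))
Grid <- as.matrix(expand.grid(axes))         # m by d matrix of grid points

# Hypothetical FUN: takes X and Grid, returns a numeric vector of length m.
nearest_dist <- function(X, Grid) {
  apply(Grid, 1, function(g) sqrt(min(colSums((t(X) - g)^2))))
}

vals <- nearest_dist(X, Grid)
```

Any function with this shape, taking X and Grid and returning one value per grid point, can be passed as FUN; extra arguments such as a bandwidth are forwarded through "...".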

Value

  • Returns the (1-alpha) quantile of the values computed by the bootstrap algorithm.

Details

bootstrapDiagram uses gridDiag to compute the persistence diagram of the input function using the entire sample. The bootstrap step is then repeated B times: each iteration computes the bottleneck distance (or the Wasserstein distance, if requested) between the original persistence diagram and the diagram computed from a bootstrap sample. Finally, the (1-alpha) quantile of these B values is returned.
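
The procedure described above can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: bootstrap_quantile, summarize, and dist_fn are hypothetical names, with summarize standing in for gridDiag and dist_fn for the bottleneck (or Wasserstein) distance. Drawing rows with replacement is an assumption about how each bootstrap sample is formed.

```r
# Hypothetical helper: (1 - alpha) quantile of B bootstrap distances.
bootstrap_quantile <- function(X, summarize, dist_fn, B = 30, alpha = 0.05) {
  n <- nrow(X)
  full <- summarize(X)                       # summary of the entire sample
  dists <- vapply(seq_len(B), function(b) {
    # Assumption: resample n rows with replacement.
    idx <- sample.int(n, size = n, replace = TRUE)
    dist_fn(full, summarize(X[idx, , drop = FALSE]))
  }, numeric(1))
  unname(quantile(dists, probs = 1 - alpha)) # value returned to the caller
}

# Toy stand-ins: column means as the "diagram", sup-norm as the "distance".
set.seed(1)
X <- matrix(rnorm(200), ncol = 2)
cc <- bootstrap_quantile(X, summarize = colMeans,
                         dist_fn = function(a, b) max(abs(a - b)),
                         B = 100, alpha = 0.05)
```

With real diagrams, the returned value plays the role of the quantile that the example below passes to plot through the band argument.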

References

Chazal F, Fasy BT, Lecci F, Michel B, Rinaldo A, Wasserman L (2014). "Robust Topological Inference: Distance-To-a-Measure and Kernel Distance." Technical Report.

Wasserman L (2004). "All of Statistics: A Concise Course in Statistical Inference." Springer.

Morozov D. "Dionysus, a C++ library for computing persistent homology." http://www.mrzv.org/software/dionysus/

See Also

bottleneck, bootstrapBand, distFct, kde, kernelDist, dtm, summary.diagram, plot.diagram

Examples

## confidence set for the Kernel Density Diagram
library(TDA)

# input data
n = 400
XX = circleUnif(n)

## Ranges of the grid
Xlim=c(-1.8,1.8)
Ylim=c(-1.6,1.6)
lim=cbind(Xlim, Ylim)
by=0.05

h = .3  #bandwidth for the function kde

#Kernel Density Diagram of the superlevel sets
Diag=gridDiag(XX, kde, lim=lim, by=by, sublevel=FALSE, printProgress=TRUE, h=h) 

# confidence set
B=10         ## the number of bootstrap iterations should be higher! 
             ## this is just an example
alpha=0.05

cc=bootstrapDiagram(XX, kde, lim=lim, by=by, sublevel=FALSE, B=B, alpha=alpha, 
   dimension=1, printProgress=TRUE, h=h)

plot(Diag$diagram, band=2*cc)
