TDAvec (version 0.1.41)

computeTemplateFunction: Compute a Vectorization of a Persistence Diagram based on Tent Template Functions

Description

For a given persistence diagram \(D=\{(b_i,d_i)\}_{i=1}^N\) (corresponding to a specified homological dimension), computeTemplateFunction() computes a vectorization using a collection of tent template functions defined by

$$G_{(b,p),\delta}(D) = \sum_{i=1}^N\max \left\{ 0, 1 - \frac{1}{\delta} \max \left( |b_i - b|, |p_i - p| \right) \right\},$$ where \(p_i=d_i-b_i\) (persistence), \(b\geq 0\), \(p>0\) and \(0<\delta<p\). The point \((b,p)\) is referred to as the center. Points in \(D\) with infinite death values are ignored.
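The formula above can be checked by hand with a few lines of base R. The following is a minimal sketch, not the package's implementation: `tent_value` is a hypothetical helper name, and the diagram is passed as plain birth and death vectors rather than as the three-column matrix that computeTemplateFunction() expects.

```r
# Hypothetical helper (not part of TDAvec): evaluate a single tent function
# G_{(b,p),delta} on a diagram given as birth and death vectors.
tent_value <- function(birth, death, b, p, delta) {
  keep <- is.finite(death)                 # points with infinite death are ignored
  pers <- death[keep] - birth[keep]        # persistence p_i = d_i - b_i
  # sup-norm distance from each point (b_i, p_i) to the center (b, p)
  dist <- pmax(abs(birth[keep] - b), abs(pers - p))
  sum(pmax(0, 1 - dist / delta))           # sum of the tent heights
}

# Toy diagram: three finite points and one point with infinite death
birth <- c(0.0, 0.1, 0.3, 0.0)
death <- c(0.5, 0.7, 0.4, Inf)

tent_value(birth, death, b = 0.1, p = 0.5, delta = 0.2)
```

In this toy example the first two points each lie at sup-norm distance 0.1 from the center and contribute 0.5 apiece, while the third point lies outside the support of the tent and contributes nothing.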

Usage

computeTemplateFunction(D, homDim, delta, d, epsilon)

Value

A numeric vector of dimension \((d+1)d\), containing the values of the tent template functions centered at the grid points \(\{(\delta i, \delta j + \epsilon)\}_{i=0,j=1}^{d,d}\): $$ \{ G_{(\delta i, \delta j + \epsilon), \delta}(D) \mid 0 \leq i \leq d, \, 1 \leq j \leq d \}. $$

When one-dimensional tent template functions are used, the returned vector has a dimension of \(d\): $$ \{ G_{\delta j + \epsilon, \delta}(D) \mid 1 \leq j \leq d \}. $$
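To make the two output sizes concrete, the grid of centers can be enumerated in base R. This is only an illustrative sketch with made-up parameter values; the order in which computeTemplateFunction() arranges the entries of its result is not stated on this page, so the row order of `centers` below is an assumption and should not be relied on.

```r
# Illustrative values (assumptions, not package defaults)
delta   <- 0.1
d       <- 3
epsilon <- 0.01

# Two-dimensional case: centers (delta * i, delta * j + epsilon)
# for i = 0, ..., d and j = 1, ..., d
centers <- expand.grid(b = delta * (0:d), p = delta * (1:d) + epsilon)
nrow(centers)          # number of centers, equal to (d + 1) * d

# One-dimensional case: centers delta * j + epsilon for j = 1, ..., d
centers_1d <- delta * (1:d) + epsilon
length(centers_1d)     # number of centers, equal to d
```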

Arguments

D

a persistence diagram: a matrix with three columns containing the homological dimension, birth and death values respectively.

homDim

the homological dimension (0 for \(H_0\), 1 for \(H_1\), etc.). Rows in D are filtered based on this value.

delta

a positive scalar giving the spacing between neighboring grid centers; the same value is used as the half-width \(\delta\) of each tent template function.

d

a positive integer specifying the grid resolution: centers are placed at \(d+1\) positions along the birth axis and \(d\) positions along the persistence axis.

epsilon

a positive scalar indicating the shift applied to the grid along the persistence (vertical) axis.

Author

Umar Islambekov

Details

The function extracts the rows of D whose first column equals homDim and evaluates the tent template functions at the grid of centers determined by delta, d, and epsilon. The number of tent functions is controlled by d: \((d+1)d\) in the two-dimensional case. \(\delta\) should be chosen so that the box \([0, \delta d] \times [\epsilon,\delta d + \epsilon]\) contains every point, in birth-persistence coordinates, of all the diagrams under consideration. \(\epsilon\) should be smaller than the minimum persistence value across those diagrams. If D does not contain any points corresponding to homDim, a vector of zeros is returned.
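One possible way to pick delta and epsilon that satisfy these conditions is sketched below. `suggest_grid` and its `shrink` argument are hypothetical and not part of TDAvec; the rule of taking epsilon as a fixed fraction of the smallest persistence is just one convenient choice, and the sketch assumes each diagram is a three-column matrix as described under Arguments.

```r
# Hypothetical helper (not part of TDAvec): choose delta and epsilon so that
# the box [0, delta*d] x [epsilon, delta*d + epsilon] covers all finite points.
suggest_grid <- function(diagrams, homDim, d, shrink = 0.5) {
  # Collect (birth, persistence) pairs for the requested dimension
  pts <- do.call(rbind, lapply(diagrams, function(D) {
    D <- D[D[, 1] == homDim & is.finite(D[, 3]), , drop = FALSE]
    cbind(birth = D[, 2], persistence = D[, 3] - D[, 2])
  }))
  stopifnot(nrow(pts) > 0)

  # epsilon strictly below the smallest persistence (assumed rule: a fraction of it)
  epsilon <- shrink * min(pts[, "persistence"])
  # smallest delta for which the box covers both coordinates
  delta <- max(max(pts[, "birth"]), max(pts[, "persistence"]) - epsilon) / d
  list(delta = delta, epsilon = epsilon)
}

# Toy diagram: columns are dimension, birth, death
D1 <- cbind(c(0, 1, 1), c(0, 0.2, 0.4), c(0.5, 0.6, 1.0))
grid <- suggest_grid(list(D1), homDim = 1, d = 4)
grid
```

The resulting values could then be passed as the delta and epsilon arguments, using the same d for every diagram so that the vectorizations are comparable.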

If homDim=0 and all the birth values are equal (e.g., zero), one-dimensional tent template functions are used instead for vectorization:

$$G_{p,\delta}(D) = \sum_{i=1}^N\max \left\{ 0, 1 - \frac{1}{\delta} |p_i - p| \right\}.$$
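The one-dimensional case can likewise be reproduced in base R. This is a sketch under stated assumptions rather than the package's code: `tent_value_1d` is a hypothetical helper, and the persistence values and grid parameters are invented for illustration.

```r
# Hypothetical helper (not part of TDAvec): one-dimensional tent function
# G_{p,delta} evaluated on a vector of persistence values.
tent_value_1d <- function(persistence, p, delta) {
  sum(pmax(0, 1 - abs(persistence - p) / delta))
}

# Toy persistence values, e.g. from an H_0 diagram whose births are all zero
pers <- c(0.05, 0.12, 0.30)

# Centers delta * j + epsilon for j = 1, 2, 3 with delta = 0.1, epsilon = 0.01
centers <- 0.1 * (1:3) + 0.01

# One value per center, giving a vector of length d = 3
v <- sapply(centers, tent_value_1d, persistence = pers, delta = 0.1)
v
```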

References

1. Perea, J.A., Munch, E. and Khasawneh, F.A. (2023). Approximating continuous functions on persistence diagrams using template functions. Foundations of Computational Mathematics, 23(4), pp. 1215-1272.

Examples

N <- 100 # The number of points to sample

set.seed(123) # Set a random seed for reproducibility

# Sample N points uniformly from the unit circle and add Gaussian noise
theta <- runif(N, min = 0, max = 2 * pi)
X <- cbind(cos(theta), sin(theta)) + rnorm(2 * N, mean = 0, sd = 0.2)

# Compute the persistence diagram using the Rips filtration built on top of X
# The 'threshold' parameter specifies the maximum distance for building simplices
D <- TDAstats::calculate_homology(X, threshold = 2)

# Compute a vectorization based on tent template functions for homological dimension H_0
computeTemplateFunction(D, homDim = 0, delta = 0.1, d = 20, epsilon = 0.01)

# Compute a vectorization based on tent template functions for homological dimension H_1
computeTemplateFunction(D, homDim = 1, delta = 0.1, d = 9, epsilon = 0.01)
