
TDAvec (version 0.1.41)

computePersistenceSilhouette: A Vector Summary of the Persistence Silhouette Function

Description

For a given persistence diagram \(D=\{(b_i,d_i)\}_{i=1}^N\) (corresponding to a specified homological dimension), computePersistenceSilhouette() vectorizes the \(p\)th power persistence silhouette function $$\phi_p(t) = \frac{\sum_{i=1}^N |d_i-b_i|^p\Lambda_i(t)}{\sum_{i=1}^N |d_i-b_i|^p},$$ where $$\Lambda_i(t) = \left\{ \begin{array}{ll} t-b_i & \quad t\in [b_i,\frac{b_i+d_i}{2}] \\ d_i-t & \quad t\in (\frac{b_i+d_i}{2},d_i]\\ 0 & \quad \text{otherwise.} \end{array} \right.$$ The function \(\phi_p\) is evaluated on the scale sequence scaleSeq, using the method selected by the argument evaluate. Points in \(D\) with infinite death values are ignored.
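The definition above can be checked by hand on a small diagram. The following base R sketch evaluates \(\phi_p(t)\) directly from the formula; silhouette_at() is a hypothetical helper written for illustration, not a TDAvec function and not the package's implementation. It relies on the identity \(\Lambda_i(t)=\max\{0,\min(t-b_i,\,d_i-t)\}\), which holds whenever \(b_i\le d_i\).

```r
# Hypothetical helper (not part of TDAvec): evaluates phi_p(t) from its definition.
silhouette_at <- function(births, deaths, t, p = 1) {
  finite <- is.finite(deaths)              # points with infinite death are ignored
  b <- births[finite]
  d <- deaths[finite]
  if (length(b) == 0) return(numeric(length(t)))

  w <- abs(d - b)^p                        # weights |d_i - b_i|^p
  rise <- outer(b, t, function(bi, s) s - bi)   # t - b_i (rows: points, cols: t)
  fall <- outer(d, t, function(di, s) di - s)   # d_i - t
  tents <- pmax(pmin(rise, fall), 0)            # Lambda_i(t)

  colSums(w * tents) / sum(w)              # weighted average of the tent functions
}

# Two points, (0, 2) and (1, 2), with p = 1: the weights are 2 and 1.
silhouette_at(births = c(0, 1), deaths = c(2, 2), t = c(1, 1.5))
# values: 2/3 and 0.5
```

At \(t=1\) only the first tent is nonzero (\(\Lambda_1(1)=1\)), giving \(2\cdot 1/3\); at \(t=1.5\) both tents equal 0.5, so the weighted average is 0.5.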

Usage

computePersistenceSilhouette(D, homDim, scaleSeq, p = 1.0, evaluate = "intervals")

Value

A numeric vector containing elements computed using scaleSeq=\(\{t_1,t_2,\ldots,t_n\}\) according to the method specified by evaluate.

  • "intervals": Computes average values of the persistence silhouette function over intervals defined by consecutive elements in scaleSeq:

    $$\Big(\frac{1}{\Delta t_1}\int_{t_1}^{t_2}\phi_p(t)dt,\frac{1}{\Delta t_2}\int_{t_2}^{t_3}\phi_p(t)dt,\ldots,\frac{1}{\Delta t_{n-1}}\int_{t_{n-1}}^{t_n}\phi_p(t)dt\Big)\in\mathbb{R}^{n-1},$$ where \(\Delta t_k=t_{k+1}-t_k\).

  • "points": Computes values of the persistence silhouette function at each point in scaleSeq:

    $$(\phi_p(t_1),\phi_p(t_2),\ldots,\phi_p(t_n))\in\mathbb{R}^n.$$
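To make the difference between the two evaluation methods concrete, here is a minimal base R sketch for a diagram with a single point (birth 0, death 2), where \(\phi_p\) reduces to one tent function for any \(p\). The interval averages are approximated with stats::integrate(); this is an illustration of the formulas above under that assumption, not necessarily how the package computes them.

```r
# Single-point diagram (b = 0, d = 2): phi_p(t) = max(0, min(t - 0, 2 - t))
phi <- function(t) pmax(0, pmin(t - 0, 2 - t))

scaleSeq <- c(0, 1, 2)
n <- length(scaleSeq)

# "points": phi evaluated at each scale value, a vector of length n
at_points <- phi(scaleSeq)

# "intervals": average of phi over each [t_k, t_{k+1}], a vector of length n - 1
at_intervals <- vapply(seq_len(n - 1), function(k) {
  lower <- scaleSeq[k]
  upper <- scaleSeq[k + 1]
  integrate(phi, lower, upper)$value / (upper - lower)
}, numeric(1))

at_points     # 0 1 0
at_intervals  # approximately 0.5 0.5
```

The "points" vector captures the peak of the tent at \(t=1\), while "intervals" smooths it: the average of \(t\) over \([0,1]\) and of \(2-t\) over \([1,2]\) is 0.5 in both cases.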

Arguments

D

a persistence diagram: a matrix with three columns containing the homological dimension, birth and death values respectively.

homDim

the homological dimension (0 for \(H_0\), 1 for \(H_1\), etc.). Rows in D are filtered based on this value.

scaleSeq

a numeric vector of increasing scale values used for vectorization.

p

power of the weights for the silhouette function. By default, p=1.

evaluate

a character string indicating the evaluation method. Must be either "intervals" (default) or "points".

Author

Umar Islambekov

Details

The function extracts rows from D where the first column equals homDim, and computes values based on the filtered data and scaleSeq. If D does not contain any points corresponding to homDim, a vector of zeros is returned.
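The filtering step can be sketched in base R on a toy diagram. select_points() and zeros_for() are hypothetical helpers used only for illustration. The length of the zero vector (n - 1 for "intervals", n for "points") is an assumption inferred from the Value section rather than something stated explicitly here.

```r
# Toy diagram in the three-column layout: dimension, birth, death
D <- matrix(c(0, 0.0, 1.0,
              0, 0.0, Inf,
              1, 0.5, 1.5),
            ncol = 3, byrow = TRUE,
            dimnames = list(NULL, c("dimension", "birth", "death")))
scaleSeq <- seq(0, 2, length.out = 11)

# Hypothetical helper: rows for one homological dimension with finite death values
select_points <- function(D, homDim) {
  D[D[, 1] == homDim & is.finite(D[, 3]), , drop = FALSE]
}

h1 <- select_points(D, homDim = 1)  # one row: birth 0.5, death 1.5
h2 <- select_points(D, homDim = 2)  # zero rows: no H_2 points in D

# Assumed shape of the all-zero result when no points match homDim
zeros_for <- function(scaleSeq, evaluate = "intervals") {
  numeric(if (evaluate == "intervals") length(scaleSeq) - 1 else length(scaleSeq))
}

if (nrow(h2) == 0) zeros_for(scaleSeq, "intervals")
```

Using drop = FALSE keeps the result a matrix even when a single row (or no row) matches, so nrow() remains reliable.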

References

1. Chazal, F., Fasy, B. T., Lecci, F., Rinaldo, A., & Wasserman, L. (2014). Stochastic convergence of persistence landscapes and silhouettes. In Proceedings of the thirtieth annual symposium on Computational geometry (pp. 474-483).

Examples

library(TDAvec) # Provides computePersistenceSilhouette()

N <- 100 # The number of points to sample

set.seed(123) # Set a random seed for reproducibility

# Sample N points uniformly from the unit circle and add Gaussian noise
theta <- runif(N, min = 0, max = 2 * pi)
X <- cbind(cos(theta), sin(theta)) + rnorm(2 * N, mean = 0, sd = 0.2)

# Compute the persistence diagram using the Rips filtration built on top of X
# The 'threshold' parameter specifies the maximum distance for building simplices
D <- TDAstats::calculate_homology(X, threshold = 2)

scaleSeq <- seq(0, 2, length.out = 11) # A sequence of scale values

# Compute a vector summary of the persistence silhouette (with p=1) for homological dimension H_0
computePersistenceSilhouette(D, homDim = 0, scaleSeq)

# Compute a vector summary of the persistence silhouette (with p=1) for homological dimension H_1
computePersistenceSilhouette(D, homDim = 1, scaleSeq)
