
KPC (version 0.1.2)

Klin: A near linear time analogue of KMAc

Description

Calculate \(\hat{\eta}_n^{\mbox{lin}}\) (the unconditional version of graph-based KPC) using a directed K-NN graph or a minimum spanning tree (MST). With a K-NN graph, the computational complexity is \(O(n\log n)\).

Usage

Klin(
  Y,
  X,
  k = kernlab::rbfdot(1/(2 * stats::median(stats::dist(Y))^2)),
  Knn = 1
)

Value

The algorithm returns a real number `Klin`: an empirical kernel measure of association, which can be computed in near linear time when K-NN graphs are used.

Arguments

Y

a matrix of responses (n by dy)

X

a matrix of predictors (n by dx)

k

a function \(k(y, y')\) of class kernel. It can be a kernel implemented in kernlab, e.g. rbfdot(sigma = 1) or vanilladot().

Knn

the number of nearest neighbors to use in the K-NN graph, or "MST" to use the minimum spanning tree. A small Knn (e.g., Knn=1) is recommended.
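The default for `k` shown under Usage is easier to read once unpacked. kernlab's `rbfdot(sigma)` evaluates \(\exp(-\sigma\|y-y'\|^2)\), so setting `sigma = 1/(2 * median(dist(Y))^2)` gives a Gaussian kernel whose bandwidth is the median pairwise Euclidean distance of the responses. The sketch below checks this numerically; the simulated data and the variable names `h`, `k_default`, `val_kernlab` and `val_by_hand` are illustrative assumptions, not part of KPC.

```r
library(kernlab)

set.seed(1)
Y <- matrix(rnorm(200), ncol = 2)           # n = 100 responses, dy = 2

h <- stats::median(stats::dist(Y))          # median pairwise Euclidean distance
k_default <- rbfdot(sigma = 1 / (2 * h^2))  # same expression as the default for `k`

# kernlab kernel objects can be called directly on two vectors
val_kernlab <- as.numeric(k_default(Y[1, ], Y[2, ]))

# Gaussian kernel with bandwidth h, written out by hand
val_by_hand <- exp(-sum((Y[1, ] - Y[2, ])^2) / (2 * h^2))

all.equal(val_kernlab, val_by_hand)
```

Passing `k = rbfdot(1)` instead, as in the Examples, fixes the bandwidth rather than adapting it to the scale of `Y`.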

Details

\(\hat{\eta}_n\) is an estimate of the population kernel measure of association, based on data \(\{(X_i,Y_i)\}_{i=1}^n\) from \(\mu\). For a K-NN graph, \(\hat{\eta}_n\) can be computed in near linear time (in \(n\)). In particular,
$$\hat{\eta}_n^{\mbox{lin}}:=\frac{n^{-1}\sum_{i=1}^n d_i^{-1}\sum_{j:(i,j)\in\mathcal{E}(G_n)} k(Y_i,Y_j)-(n-1)^{-1}\sum_{i=1}^{n-1} k(Y_i,Y_{i+1})}{n^{-1}\sum_{i=1}^n k(Y_i,Y_i)-(n-1)^{-1}\sum_{i=1}^{n-1} k(Y_i,Y_{i+1})},$$
where all symbols have their usual meanings as in the definition of \(\hat{\eta}_n\). Euclidean distance is used for computing the K-NN graph and the MST.
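To make the three sums in the formula concrete, here is a minimal base-R sketch that evaluates it term by term. It is not the package's implementation: `klin_sketch` is a hypothetical helper, and it assumes that \(G_n\) is the directed K-NN graph built on the \(X_i\), that \(d_i\) is the out-degree of \(X_i\) (so \(d_i = K\)), and that the kernel is Gaussian, \(\exp(-\sigma\|y-y'\|^2)\). It also finds neighbours by brute force in \(O(n^2)\) time, so it illustrates the definition and not the near-linear running time.

```r
# Hypothetical helper, not part of KPC: evaluates the displayed formula directly.
klin_sketch <- function(Y, X, K = 1, sigma = 1) {
  Y <- as.matrix(Y)
  X <- as.matrix(X)
  n <- nrow(Y)
  stopifnot(nrow(X) == n, K >= 1, K < n)

  # Gaussian kernel k(y, y') = exp(-sigma * ||y - y'||^2)
  kern <- function(a, b) exp(-sigma * sum((a - b)^2))

  # Directed K-NN graph on X under Euclidean distance (brute force, O(n^2))
  D <- as.matrix(dist(X))
  diag(D) <- Inf                       # a point is not its own neighbour
  nn <- lapply(seq_len(n), function(i) order(D[i, ])[seq_len(K)])

  # n^{-1} sum_i d_i^{-1} sum_{j:(i,j) in E(G_n)} k(Y_i, Y_j), assuming d_i = K
  graph_term <- mean(vapply(seq_len(n), function(i) {
    mean(vapply(nn[[i]], function(j) kern(Y[i, ], Y[j, ]), numeric(1)))
  }, numeric(1)))

  # (n-1)^{-1} sum_{i=1}^{n-1} k(Y_i, Y_{i+1})
  consec_term <- mean(vapply(seq_len(n - 1),
                             function(i) kern(Y[i, ], Y[i + 1, ]), numeric(1)))

  # n^{-1} sum_i k(Y_i, Y_i)
  diag_term <- mean(vapply(seq_len(n),
                           function(i) kern(Y[i, ], Y[i, ]), numeric(1)))

  (graph_term - consec_term) / (diag_term - consec_term)
}

# Each point's nearest neighbour in X carries the same Y, so the graph term
# equals the diagonal term and the ratio is 1.
klin_sketch(Y = c(0, 0, 1, 1), X = c(1, 2, 3.5, 4.5))
```

The term \((n-1)^{-1}\sum_i k(Y_i,Y_{i+1})\) appears in both numerator and denominator and needs only a single pass over the data, whereas the graph term needs only the K neighbours of each point. That is consistent with the near-linear cost stated above once the neighbour search itself is done efficiently.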

References

Deb, N., P. Ghosal, and B. Sen (2020), “Measuring association on topological spaces using kernels and geometric graphs” <arXiv:2010.01768>.

See Also

KPCgraph, KMAc

Examples

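library(KPC)      # provides Klin()
library(kernlab)  # provides rbfdot()
set.seed(1)       # make the simulated data reproducible
Klin(Y = rnorm(100), X = rnorm(100), k = rbfdot(1), Knn = 1)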
