Learn R Programming

FastKRR (version 0.1.2)

approx_kernel: Compute low-rank approximations(Nyström, Pivoted Cholesky, RFF)

Description

Computes low-rank kernel approximation \(\tilde{K} \in \mathbb{R}^{n \times n}\)using three methods: Nyström approximation, Pivoted Cholesky decomposition, and Random Fourier Features (RFF).

Usage

approx_kernel(
  K = NULL,
  X = NULL,
  opt = c("nystrom", "pivoted", "rff"),
  kernel = c("gaussian", "laplace"),
  m = NULL,
  d,
  rho,
  eps = 1e-06,
  W = NULL,
  b = NULL,
  n_threads = 4
)

Value

An S3 object of class "approx_kernel" containing the results of the kernel approximation:

  • call: The matched function call used to create the object.

  • opt: The kernel approximation method actually used ("nystrom", "pivoted", "rff").

  • K_approx: \(n \times n\) approximated kernel matrix.

  • m: Kernel approximation degree.

Additional components depend on the value of opt:

nystrom

  • n_threads: Number of threads used in the computation.

pivoted

  • eps: Numerical tolerance used for early stopping in the pivoted Cholesky decomposition.

rff

  • d: Input design matrix's dimension.

  • rho: Scaling parameter of the kernel.

  • W: \(m \times d\) Random frequency matrix.

  • b: Random phase \(m\)--vector.

  • used_supplied_Wb: Logical; TRUE if user-supplied W, b were used, FALSE otherwise.

  • n_threads: Number of threads used in the computation.

Arguments

K

Exact Kernel matrix \(K \in \mathbb{R}^{n \times n}\). Used in "nystrom" and "pivoted".

X

Design matrix \(X \in \mathbb{R}^{n \times d}\). Only required for "rff".

opt

Method for constructing or approximating :

"nystrom"

Construct a low-rank approximation of the kernel matrix \(K \in \mathbb{R}^{n \times n}\) using the Nyström approximation.

"pivoted"

Construct a low-rank approximation of the kernel matrix \(K \in \mathbb{R}^{n \times n}\) using Pivoted Cholesky decomposition.

"rff"

Construct a low-rank approximation of the kernel matrix \(K \in \mathbb{R}^{n \times n}\) using Random Fourier Features (RFF).

kernel

Kernel type either "gaussian"or "laplace".

m

Approximation rank (number of random features) for the low-rank kernel approximation. If not specified, the recommended choice is $$\lceil n \cdot \log(d + 5) / 10 \rceil$$ where \(X\) is design matrix, \(n = nrow(X)\) and \(d = ncol(X)\).

d

Design matrix's dimension (\(d = ncol(X)\)).

rho

Scaling parameter of the kernel (\(\rho\)), specified by the user.

eps

Tolerance parameter used only in "pivoted" for stopping criterion of the Pivoted Cholesky decomposition.

W

Random frequency matrix \(\omega \in \mathbb{R}^{m \times d}\)

b

Random phase vector \(b \in \mathbb{R}^m\), i.i.d. \(\mathrm{Unif} [ 0,\,2\pi ]\).

n_threads

Number of parallel threads. The default is 4. If the system does not support 4 threads, it automatically falls back to 1 thread. It is applied only for opt = "nystrom" or opt = "rff" , and for the Laplace kernel (kernel = "laplace").

Details

Requirements and what to supply:

Common

  • d and rho must be provided (non-NULL).

nystrom / pivoted

  • Require a precomputed kernel matrix K; error if K is NULL.

  • If m is NULL, use \(\lceil n \cdot \log(d + 5) / 10 \rceil\).

  • For "pivoted", a tolerance eps is used; the decomposition stops early when the next pivot (residual diagonal) drops below eps.

rff

  • K must be NULL (not used) and X must be provided with d = ncol(X).

  • The function automatically generates W (random frequency matrix \(\omega \in \mathbb{R}^{m \times d}\)) and b (random phase vector \(b \in \mathbb{R}^{m}\)).

  • If the user provides them manually, both W and b must be specified and their dimensions must be compatible.

Examples

Run this code
# Data setting
set.seed(1)
d = 1
n = 1000
m = 50
X = matrix(runif(n*d, 0, 1), nrow = n, ncol = d)
y = as.vector(sin(2*pi*rowMeans(X)^3) + rnorm(n, 0, 0.1))
K = make_kernel(X, kernel = "gaussian", rho = 1)

# Example: RFF approximation
K_rff = approx_kernel(X = X, opt = "rff", kernel = "gaussian",
                      m = m, d = d, rho = 1,
                      n_threads = 1)

# Exapmle: Nystrom approximation
K_nystrom = approx_kernel(K = K, opt = "nystrom",
                          m = m, d = d, rho = 1,
                          n_threads = 1)

# Example: Pivoted Cholesky approximation
K_pivoted = approx_kernel(K = K, opt = "pivoted",
                          m = m, d = d, rho = 1)

Run the code above in your browser using DataLab