projectKNNs: Project a distance matrix into a lower-dimensional space. (from elbamos/largeVis)

Description

Takes as input a sparse matrix of the edge weights connecting each node to its nearest neighbors, and outputs a matrix of coordinates embedding the inputs in a lower-dimensional space.

Usage

projectKNNs(
  wij,
  dim = 2,
  sgd_batches = NULL,
  M = 5,
  gamma = 7,
  alpha = 1,
  rho = 1,
  coords = NULL,
  useDegree = FALSE,
  momentum = NULL,
  seed = NULL,
  threads = NULL,
  verbose = getOption("verbose", TRUE)
)

Value

A dense [N,D] matrix of the coordinates projecting the w_ij matrix into the lower-dimensional space.

Arguments

wij: A symmetric sparse matrix of edge weights, in C-compressed format, as created with the Matrix package.
dim: numeric The number of dimensions for the projection space (default=2)
sgd_batches: numeric The number of edges to process during SGD (default=NULL). Defaults to a value set based on the size of the dataset. If the parameter given is between 0 and 1, the default value will be multiplied by the parameter.
M: numeric (largeVis) The number of negative edges to sample for each positive edge (default=5).
gamma: numeric (largeVis) The strength of the force pushing non-neighbor nodes apart (default=7).
alpha: numeric (largeVis) The hyperparameter in the distance function (default=1). The default distance function, $1 / (1 + \alpha \dot ||y_i - y_j||^2)$. The function relates the distance between points in the low-dimensional projection to the likelihood that the two points are nearest neighbors. Increasing $\alpha$ tends to push nodes and their neighbors closer together; decreasing $\alpha$ produces a broader distribution. Setting $\alpha$ to zero enables the alternative distance function. $\alpha$ below zero is meaningless.
rho: (largeVis) numeric Initial learning rate (default=1)
coords: An initialized coordinate matrix (default=NULL)
useDegree: boolean Whether to use vertex degree to determine weights in negative sampling (if TRUE) or the sum of the vertex's edges (if FALSE) (default=FALSE)
momentum: If not NULL, SGD with momentum is used, with this multiplier, which must be between 0 and 1 (default=NULL). Note that momentum can drastically speed-up training time, at the cost of additional memory consumed.
seed: numeric Random seed to be passed to the C++ functions (default=NULL). Sampled from hardware entropy pool if NULL (the default). Note that if the seed is not NULL (the default), the maximum number of threads will be set to 1 in phases of the algorithm that would otherwise be non-deterministic.
threads: numeric The maximum number of threads to spawn (default=NULL). Determined automatically if NULL (the default).
verbose: boolean Verbosity (default=getOption("verbose", TRUE))

Details

The algorithm attempts to estimate a dim-dimensional embedding using stochastic gradient descent and negative sampling.

The objective function is: $$ O = \sum_{(i,j)\in E} w_{ij} (\log f(||p(e_{ij} = 1||) + \sum_{k=1}^{M} E_{jk~P_{n}(j)} \gamma \log(1 - f(||p(e_{ij_k} - 1||)))$$ where $f()$ is a probabilistic function relating the distance between two points in the low-dimensional projection space, and the probability that they are nearest neighbors.

The default probabilistic function is $1 / (1 + \alpha \dot ||x||^2)$. If $\alpha$ is set to zero, an alternative probabilistic function, $1 / (1 + \exp(x^2))$ will be used instead.

Note that the input matrix should be symmetric. If any columns in the matrix are empty, the function will fail.