tSNE: t-Distributed Stochastic Neighbor Embedding

Description

Creates a k-dimensional representation of the data. The method models the probability of picking neighbors with a Gaussian in the high-dimensional space and with a Student's t-distribution in the low-dimensional map, then minimizes the Kullback-Leibler (KL) divergence between the two distributions. This implementation uses the same default parameters as defined by the authors.
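To make the cost function concrete, the following base-R sketch computes Gaussian affinities in the input space, Student's t affinities in a candidate map, and the KL divergence between them. It is a simplified illustration and not the code behind tSNE(): it assumes a single fixed bandwidth (`sigma`, a hypothetical value) instead of one bandwidth per point tuned to the perplexity, and it evaluates the cost for a random map `Y` without optimizing it.

```r
# Sketch only: one fixed Gaussian bandwidth for all points (an assumption);
# t-SNE proper tunes a bandwidth per point to match the target perplexity.
set.seed(1)
X <- as.matrix(iris[1:20, 1:4])             # small high-dimensional sample
Y <- matrix(rnorm(nrow(X) * 2), ncol = 2)   # hypothetical random 2-D map

# High-dimensional affinities: Gaussian kernel on squared distances
D2 <- as.matrix(dist(X))^2
sigma <- 1                                  # assumed bandwidth
P <- exp(-D2 / (2 * sigma^2))
diag(P) <- 0                                # a point is not its own neighbor
P <- P / sum(P)

# Low-dimensional affinities: Student's t kernel with one degree of freedom
d2 <- as.matrix(dist(Y))^2
Q <- 1 / (1 + d2)
diag(Q) <- 0
Q <- Q / sum(Q)

# KL divergence KL(P || Q), summed over pairs where P is non-zero
nz <- P > 0
kl <- sum(P[nz] * log(P[nz] / Q[nz]))
kl
```

The optimizer moves the rows of `Y` to reduce this quantity; a lower value means the map's neighbor probabilities are closer to those of the original data.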

Usage

tSNE(X, Y = NULL, k = 2, perplexity = 30, n.iter = 1000, eta = 500, initial.momentum = 0.5, final.momentum = 0.8, early.exaggeration = 4, gain.fraction = 0.2, momentum.threshold.iter = 20, exaggeration.threshold.iter = 100, max.binsearch.tries = 50)

Arguments

X

A data frame, data matrix, dissimilarity (distance) matrix or dist object.

Y

Initial k-dimensional configuration. If NULL, the method uses a random initial configuration.

k

Target dimensionality. Values other than 2 or 3 are not recommended.

perplexity

A rough upper bound on the neighborhood size.

n.iter

Number of iterations to perform.

eta

The "learning rate" for the cost function minimization.

initial.momentum

The momentum used during the first momentum.threshold.iter iterations.

final.momentum

The momentum used for the remaining iterations.

early.exaggeration

The early exaggeration factor applied during the initial iterations.

gain.fraction

Undocumented

momentum.threshold.iter

Number of iterations before switching to the final momentum.

exaggeration.threshold.iter

Number of iterations before using the real (unexaggerated) probabilities.

max.binsearch.tries

Maximum number of tries in the binary search for the parameters that achieve the target perplexity.
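The interaction between perplexity and max.binsearch.tries can be illustrated with a small base-R sketch of the search for a single point. The helper names (`row_perplexity`, `find_beta`), the tolerance, and the parameterization by a precision `beta = 1 / (2 * sigma^2)` are assumptions made for illustration; the internals of tSNE() may differ.

```r
# Perplexity (2^entropy, entropy in bits) of one point's neighbor distribution,
# given squared distances d2 to the other points and a Gaussian precision beta.
row_perplexity <- function(d2, beta) {
  p <- exp(-(d2 - min(d2)) * beta)   # shifting by min(d2) avoids underflow
  p <- p / sum(p)
  H <- -sum(p[p > 0] * log2(p[p > 0]))
  2^H
}

# Hypothetical helper: bisection on beta until the perplexity matches the
# target or the maximum number of tries is reached.
find_beta <- function(d2, perplexity = 30, max_tries = 50, tol = 1e-5) {
  lo <- 0
  hi <- Inf
  beta <- 1
  for (i in seq_len(max_tries)) {
    perp <- row_perplexity(d2, beta)
    if (abs(perp - perplexity) < tol) break
    if (perp > perplexity) {
      # Distribution too flat: raise beta (narrower Gaussian)
      lo <- beta
      beta <- if (is.finite(hi)) (lo + hi) / 2 else beta * 2
    } else {
      # Distribution too peaked: lower beta (wider Gaussian)
      hi <- beta
      beta <- (lo + hi) / 2
    }
  }
  beta
}

# Squared distances from the first iris observation to all others
D2 <- as.matrix(dist(iris[, 1:4]))^2
beta1 <- find_beta(D2[1, -1], perplexity = 30)
perp1 <- row_perplexity(D2[1, -1], beta1)
perp1
```

Because perplexity decreases monotonically as `beta` grows, bisection converges quickly; the cap on tries only matters for degenerate inputs, for example a target perplexity larger than the number of available neighbors.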

Value

The k-dimensional representation of the data.

References

L.J.P. van der Maaten and G.E. Hinton. _Visualizing High-Dimensional Data Using t-SNE._ Journal of Machine Learning Research 9(Nov): 2579-2605, 2008.

Examples

# Iris example
emb <- tSNE(iris[, 1:4])
plot(emb, col=iris$Species)