mp (version 0.4.1)

tSNE: t-Distributed Stochastic Neighbor Embedding

Description

Creates a k-dimensional representation of the data. The probability of picking neighbors is modeled with a Gaussian distribution in the high-dimensional data and with a Student's t-distribution in the low-dimensional map, and the KL divergence between the two is then minimized. This implementation uses the same default parameters as defined by the authors of t-SNE.
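
To make the cost function concrete, the following base-R sketch computes the two neighbor distributions and their KL divergence for a random map. It is an illustration only, not the code used by this package: the fixed bandwidth `sigma` is a simplifying assumption (t-SNE tunes one bandwidth per point from the perplexity), and the names `P`, `Q`, `W` and `kl` are chosen here for illustration.

```r
# Base-R sketch of the t-SNE objective; not the mp implementation.
set.seed(1)
X <- as.matrix(iris[1:30, 1:4])            # small high-dimensional sample
Y <- matrix(rnorm(nrow(X) * 2), ncol = 2)  # random 2-D map

sigma <- 1  # assumed fixed bandwidth; t-SNE tunes one per point via perplexity

# High-dimensional affinities: Gaussian kernel on squared distances
D2 <- as.matrix(dist(X))^2
P <- exp(-D2 / (2 * sigma^2))
diag(P) <- 0
P <- P / rowSums(P)                # conditional probabilities, one row per point
P <- (P + t(P)) / (2 * nrow(P))    # symmetrized joint probabilities, sum to 1

# Low-dimensional affinities: Student-t kernel with one degree of freedom
W <- 1 / (1 + as.matrix(dist(Y))^2)
diag(W) <- 0
Q <- W / sum(W)

# KL divergence between P and Q (the cost that t-SNE minimizes)
off <- P > 0
kl <- sum(P[off] * log(P[off] / Q[off]))
kl
```

An optimizer would then move the rows of `Y` to reduce `kl`; here the map is left at its random starting point.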

Usage

tSNE(X, Y = NULL, k = 2, perplexity = 30, n.iter = 1000, eta = 500, initial.momentum = 0.5, final.momentum = 0.8, early.exaggeration = 4, gain.fraction = 0.2, momentum.threshold.iter = 20, exaggeration.threshold.iter = 100, max.binsearch.tries = 50)

Arguments

X
A data frame, data matrix, dissimilarity (distance) matrix or dist object.
Y
Initial k-dimensional configuration. If NULL, the method uses a random initial configuration.
k
Target dimensionality. Avoid anything other than 2 or 3.
perplexity
A rough upper bound on the neighborhood size.
n.iter
Number of iterations to perform.
eta
The learning rate used when minimizing the cost function.
initial.momentum
The momentum used for the first momentum.threshold.iter iterations.
final.momentum
The momentum used for the remaining iterations.
early.exaggeration
The early exaggeration factor applied during the first exaggeration.threshold.iter iterations.
gain.fraction
Undocumented
momentum.threshold.iter
Number of iterations before switching to the final momentum.
exaggeration.threshold.iter
Number of iterations before early exaggeration is turned off and the real probabilities are used.
max.binsearch.tries
Maximum number of tries in the binary search for the parameters that achieve the target perplexity.
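
The role of perplexity and max.binsearch.tries can be illustrated with a small base-R sketch of the binary search for a single point. It assumes the usual approach from the t-SNE paper, in which the precision of the Gaussian kernel is adjusted until the entropy of the neighbor distribution equals the logarithm of the target perplexity. The function `perplexity_search` and its `tol` argument are hypothetical and are not part of this package.

```r
# Sketch: find the Gaussian precision beta = 1 / (2 * sigma^2) for one point
# so that the perplexity of its neighbor distribution matches a target.
perplexity_search <- function(d2, target, max.tries = 50, tol = 1e-5) {
  beta <- 1
  lo <- 0
  hi <- Inf
  for (i in seq_len(max.tries)) {
    p <- exp(-beta * d2)
    p <- p / sum(p)
    h <- -sum(p[p > 0] * log(p[p > 0]))  # Shannon entropy in nats
    gap <- h - log(target)
    if (abs(gap) < tol) break
    if (gap > 0) {
      # Entropy too high: narrow the kernel by increasing beta
      lo <- beta
      beta <- if (is.finite(hi)) (beta + hi) / 2 else beta * 2
    } else {
      # Entropy too low: widen the kernel by decreasing beta
      hi <- beta
      beta <- (beta + lo) / 2
    }
  }
  p <- exp(-beta * d2)
  p <- p / sum(p)
  list(beta = beta, p = p,
       perplexity = exp(-sum(p[p > 0] * log(p[p > 0]))))
}

# Squared distances from the first iris observation to all others
d2 <- as.matrix(dist(iris[, 1:4]))[1, -1]^2
res <- perplexity_search(d2, target = 30)
res$perplexity
```

If the search runs out of tries before reaching the tolerance, the last value of `beta` is used, which is why a cap such as max.binsearch.tries is needed.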

Value

The k-dimensional representation of the data.

References

L.J.P. van der Maaten and G.E. Hinton. _Visualizing High-Dimensional Data Using t-SNE._ Journal of Machine Learning Research 9(Nov): 2579-2605, 2008.

Examples

# Iris example
emb <- tSNE(iris[, 1:4])
plot(emb, col=iris$Species)
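
A second, hedged example shows the other documented input types: a dist object for X and an explicit initial configuration for Y. It assumes that Y is expected as a matrix with one row per observation and k columns, and that the returned embedding has the same layout; neither is confirmed by this page. The names `d`, `Y0` and `emb3` are chosen here for illustration.

```r
# Hedged usage sketch; runs only if the mp package is installed.
if (requireNamespace("mp", quietly = TRUE)) {
  d  <- dist(iris[, 1:4])                        # precomputed distances
  Y0 <- matrix(rnorm(nrow(iris) * 3), ncol = 3)  # assumed layout: n rows, k columns
  emb3 <- mp::tSNE(d, Y = Y0, k = 3, perplexity = 20)
  pairs(emb3, col = iris$Species)                # scatterplot matrix of the 3-D map
}
```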