tSNE-class: t-Distributed Stochastic Neighbor Embedding

Description

An S4 Class for t-SNE.

Slots

fun: A function that does the embedding and returns a dimRedResult object.

stdpars: The standard parameters for the function.

General usage

Dimensionality reduction methods are S4 classes that can be used in two ways. They can be used directly, in which case they have to be initialized and a full list of parameters has to be handed to the @fun() slot. Alternatively, the method name can be passed to the embed function and parameters given via ..., in which case missing parameters are replaced by the ones in @stdpars.
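The two usage modes can be sketched side by side. This is a minimal illustration, assuming the dimRed and Rtsne packages are installed and that @stdpars contains a perplexity entry, as the parameter list on this page suggests. The object names pars, emb_direct and emb_embed are chosen here for illustration only.

```r
library(dimRed)

dat <- loadDataSet("3D S Curve", n = 300)

## direct use: the @fun() slot needs the full parameter list,
## so start from the defaults and override a single entry
tsne <- tSNE()
pars <- tsne@stdpars
pars$perplexity <- 50
emb_direct <- tsne@fun(dat, pars)

## via embed(): only the changed parameter is given,
## the remaining ones are taken from @stdpars
emb_embed <- embed(dat, "tSNE", perplexity = 50)
```

Both calls should return a dimRedResult object. The two embeddings will generally not be identical, because t-SNE starts from a random initialization.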

Parameters

t-SNE can take the following parameters:

d: A distance function; defaults to Euclidean distances.
perplexity: The perplexity parameter, roughly equivalent to the neighborhood size.
theta: Speed/accuracy trade-off of the approximation; larger values are less accurate.
ndim: The number of embedding dimensions.

Implementation

Wraps around Rtsne, which is very well documented. Setting theta = 0 runs exact t-SNE; values of theta above 0 (up to 1) use the Barnes-Hut algorithm, which scales much better with data size. Larger values of perplexity take larger neighborhoods into account.
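The effect of theta can be tried out by calling Rtsne directly. The sketch below is an illustration on random toy data and does not show how this class calls Rtsne internally. The variable names x, exact and bh are assumptions made for this example.

```r
library(Rtsne)

set.seed(1)
## toy data: 200 points in 5 dimensions
x <- matrix(rnorm(200 * 5), ncol = 5)

## theta = 0: exact t-SNE
exact <- Rtsne(x, dims = 2, perplexity = 30, theta = 0)

## theta = 0.5: Barnes-Hut approximation
bh <- Rtsne(x, dims = 2, perplexity = 30, theta = 0.5)

## both results store the embedding coordinates in $Y
dim(exact$Y)
dim(bh$Y)
```

On a data set this small the run time difference is negligible; the approximation is mainly useful for data sets with many more points.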

Details

t-SNE embeds the data by minimizing the Kullback-Leibler divergence between pairwise similarities derived from the distances in the high-dimensional space and those in the low-dimensional space. The method is well suited to visualizing complex structures in low dimensions.
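For orientation, the objective can be written out following van der Maaten and Hinton (2008). The notation is introduced here and is not part of the package: x_i are the n input points, y_i their embedded counterparts, and sigma_i is the per-point bandwidth determined by the perplexity.

```latex
% similarities in the high-dimensional space
p_{j|i} = \frac{\exp\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}
               {\sum_{k \neq i} \exp\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)},
\qquad
p_{ij} = \frac{p_{j|i} + p_{i|j}}{2n}

% similarities in the low-dimensional space (Student t kernel, one degree of freedom)
q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}

% cost minimized with respect to the y_i
C = \mathrm{KL}(P \,\Vert\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}
```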

References

van der Maaten, L., 2014. Accelerating t-SNE using Tree-Based Algorithms. J. Mach. Learn. Res. 15, 3221-3245.

van der Maaten, L., Hinton, G., 2008. Visualizing Data using t-SNE. J. Mach. Learn. Res. 9, 2579-2605.

Examples

# NOT RUN {
library(dimRed)

## sample 300 points from the 3D S curve data set
dat <- loadDataSet("3D S Curve", n = 300)

## using the S4 class directly:
tsne <- tSNE()
emb <- tsne@fun(dat, tsne@stdpars)

## using embed()
emb2 <- embed(dat, "tSNE", perplexity = 80)

plot(emb, type = "2vars")
plot(emb2, type = "2vars")
# }