UMAP

```
dimred_umap(
  x,
  ndim = 2,
  distance_method = c("euclidean", "cosine", "manhattan"),
  pca_components = 50,
  n_neighbors = 15L,
  init = "spectral",
  n_threads = 1
)
```

x

Log transformed expression data, with rows as cells and columns as features

ndim

The number of dimensions in the output embedding

distance_method

The name of the distance metric, see dynutils::calculate_distance

pca_components

The number of PCA components to use for UMAP. If NULL, PCA will not be performed first.

n_neighbors

The size of local neighborhood (in terms of number of neighboring sample points).

init

Type of initialization for the coordinates. Options are:

- `"spectral"`: Spectral embedding using the normalized Laplacian of the fuzzy 1-skeleton, with Gaussian noise added.
- `"normlaplacian"`: Spectral embedding using the normalized Laplacian of the fuzzy 1-skeleton, without noise.
- `"random"`: Coordinates assigned using a uniform random distribution between -10 and 10.
- `"lvrandom"`: Coordinates assigned using a Gaussian distribution with standard deviation 1e-4, as used in LargeVis (Tang et al., 2016) and t-SNE.
- `"laplacian"`: Spectral embedding using the Laplacian Eigenmap (Belkin and Niyogi, 2002).
- `"pca"`: The first two principal components from PCA of `X` if `X` is a data frame, and from a 2-dimensional classical MDS if `X` is of class `"dist"`.
- `"spca"`: Like `"pca"`, but each dimension is then scaled so the standard deviation is 1e-4, to give a distribution similar to that used in t-SNE. This is an alias for `init = "pca", init_sdev = 1e-4`.
- `"agspectral"`: An "approximate global" modification of `"spectral"` which sets all edges in the graph to a value of 1, and then sets a random number of edges (`negative_sample_rate` edges per vertex) to 0.1, to approximate the effect of non-local affinities.
- A matrix of initial coordinates.
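To make the non-spectral options more concrete, here is a minimal base-R sketch of coordinates matching the descriptions of `"random"`, `"lvrandom"` and `"spca"`. It is an illustration under stated assumptions, not the code used internally: the toy matrix `x` and all variable names are hypothetical.

```r
set.seed(42)   # reproducible draws
n_cells <- 100
ndim <- 2

# "random": uniform coordinates between -10 and 10
init_random <- matrix(runif(n_cells * ndim, min = -10, max = 10), ncol = ndim)

# "lvrandom": Gaussian coordinates with standard deviation 1e-4
init_lvrandom <- matrix(rnorm(n_cells * ndim, sd = 1e-4), ncol = ndim)

# "spca": leading principal components, each rescaled to standard deviation 1e-4
x <- matrix(rnorm(n_cells * 20), nrow = n_cells)   # hypothetical toy data
pcs <- prcomp(x)$x[, seq_len(ndim)]
init_spca <- sweep(pcs, 2, apply(pcs, 2, sd), "/") * 1e-4
```

Each result is a matrix with one row per cell and one column per output dimension, which is also the shape expected when `init` is given as a matrix of initial coordinates.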

For spectral initializations (`"spectral"`, `"normlaplacian"`, `"laplacian"`), if more than one connected component is identified, each connected component is initialized separately and the results are merged. If `verbose = TRUE`, the number of connected components is logged to the console. The existence of multiple connected components implies that a global view of the data cannot be attained with this initialization. Either a PCA-based initialization or increasing the value of `n_neighbors` may be more appropriate.
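The effect of `n_neighbors` on connectivity can be sketched in base R. The example below counts connected components of a plain symmetrized k-nearest-neighbor graph, which is a simplification of the fuzzy 1-skeleton described above; the helper `count_components()` and the toy data are hypothetical and only meant to illustrate the idea.

```r
set.seed(1)
# Hypothetical toy data: two tight clusters of 10 points each, far apart
x <- rbind(
  matrix(rnorm(20, mean = 0, sd = 0.1), ncol = 2),
  matrix(rnorm(20, mean = 100, sd = 0.1), ncol = 2)
)

# Count connected components of the symmetrized k-nearest-neighbor graph
count_components <- function(x, k) {
  d <- as.matrix(dist(x))
  n <- nrow(d)
  adj <- matrix(FALSE, n, n)
  for (i in seq_len(n)) {
    nn <- order(d[i, ])[2:(k + 1)]   # skip position 1 (the point itself)
    adj[i, nn] <- TRUE
  }
  adj <- adj | t(adj)                # treat edges as undirected
  label <- integer(n)                # 0 means "not visited yet"
  n_comp <- 0L
  for (start in seq_len(n)) {
    if (label[start] != 0L) next
    n_comp <- n_comp + 1L
    label[start] <- n_comp
    queue <- start
    while (length(queue) > 0) {      # breadth-first search from `start`
      v <- queue[1]
      queue <- queue[-1]
      nb <- which(adj[v, ] & label == 0L)
      label[nb] <- n_comp
      queue <- c(queue, nb)
    }
  }
  n_comp
}

count_components(x, k = 9)    # 2: every neighbor lies within the same cluster
count_components(x, k = 15)   # 1: neighbor lists now reach the other cluster
```

With 10 points per cluster, 9 neighbors never leave a cluster, while 15 neighbors must include points from the other cluster, so the graph becomes connected.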

n_threads

Number of threads to use (except during stochastic gradient descent). The usage above sets the default to 1. For nearest neighbor search, only applies if `nn_method = "annoy"`. If `n_threads > 1`, then the Annoy index will be temporarily written to disk in the location determined by `tempfile`.
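When asking for more than one thread, it can help to cap the request at what the machine offers. A minimal sketch, assuming the base `parallel` package is available; the cap of 4 and the variable names are arbitrary, hypothetical choices:

```r
# Number of logical cores reported by the system (may be NA on some platforms)
available <- parallel::detectCores()
if (is.na(available)) available <- 1L

# Request at most 4 threads, never fewer than 1
n_threads <- max(1L, min(4L, available))
n_threads
```

The resulting value can then be passed as the `n_threads` argument.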

```
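library(Matrix)
# Simulate a sparse, non-negative 100 x 100 matrix (rows are cells, columns are features)
dataset <- abs(Matrix::rsparsematrix(100, 100, .5))
# Embed into 2 dimensions, skipping the PCA preprocessing step
dimred_umap(dataset, ndim = 2, pca_components = NULL)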
```
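A variation on the example above that sets the remaining arguments explicitly may be useful as a starting point. This is a sketch under two assumptions: that `dimred_umap()` is provided by the `dyndimred` package, and that the package and its dependencies are installed. The call is skipped otherwise, and the specific argument values are illustrative rather than recommended.

```r
library(Matrix)

set.seed(1)
# Same kind of toy input as above: sparse, non-negative, cells x features
dataset <- abs(Matrix::rsparsematrix(100, 100, .5))

# Assumption: dimred_umap() is provided by the dyndimred package
if (requireNamespace("dyndimred", quietly = TRUE)) {
  embedding <- dyndimred::dimred_umap(
    dataset,
    ndim = 2,
    distance_method = "cosine",   # one of the three documented metrics
    pca_components = NULL,        # skip the PCA preprocessing step
    n_neighbors = 30L,            # larger neighborhoods than the default of 15
    init = "pca",                 # PCA-based rather than spectral initialization
    n_threads = 1
  )
  str(embedding)                  # inspect the structure of the result
}
```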
