# Spectrum

##### Spectrum: Versatile ultra-fast spectral clustering for single and multi-view data

Spectrum is a fast adaptive spectral clustering method for single or multi-view data. It uses a new type of adaptive density-aware kernel that strengthens local connections in the graph. To integrate multi-view data and reduce noise, it applies a tensor product graph data integration and diffusion procedure. Spectrum offers two approaches for finding the number of clusters (K): the classical eigengap method and a novel multimodality gap method. The multimodality gap method analyses the distribution of the eigenvectors of the graph Laplacian to decide K and can also be used to tune the kernel.
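
The classical eigengap idea mentioned above can be illustrated with a minimal base R sketch. This is a generic version of the heuristic and not Spectrum's internal code: the toy data, the fixed Gaussian kernel with sigma = 1, and all variable names are assumptions made for illustration.

```r
set.seed(1)

# Toy data (assumption): 3 tight, well-separated groups of 20 samples each,
# with samples as columns and 2 features as rows.
centres <- c(0, 10, 20)
toy <- do.call(cbind, lapply(centres, function(m) {
  matrix(rnorm(2 * 20, mean = m, sd = 0.1), nrow = 2)
}))

# Plain Gaussian affinity between samples (fixed sigma = 1, not adaptive).
d2 <- as.matrix(dist(t(toy)))^2
A <- exp(-d2 / 2)
diag(A) <- 0

# Symmetrically normalised affinity D^(-1/2) A D^(-1/2).
deg <- rowSums(A)
L <- A / sqrt(outer(deg, deg))

# eigen() returns eigenvalues in decreasing order for symmetric input.
ev <- eigen(L, symmetric = TRUE)$values

# Eigengap heuristic: choose K at the largest drop between consecutive
# eigenvalues, searching only up to maxk.
maxk <- 10
gaps <- ev[1:maxk] - ev[2:(maxk + 1)]
K <- which.max(gaps)
```

On data like this the affinity matrix is close to block diagonal, so the leading eigenvalues sit near 1 for each group and the largest gap falls after them.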

##### Usage

```
Spectrum(data, method = 1, silent = FALSE, showres = TRUE,
  diffusion = TRUE, kerneltype = c("density", "stsc"), maxk = 10,
  NN = 3, NN2 = 7, showpca = FALSE, showheatmap = FALSE,
  showdimred = FALSE, visualisation = c("umap", "tsne"), frac = 2,
  thresh = 7, fontsize = 18)
```

##### Arguments

- data
Data frame or list of data frames: contains the data, with samples as columns and features as rows. For multi-view data, supply a list of data frames with the samples in the same order in each.

- method
Numerical value: 1 = default eigengap method (Gaussian clusters), 2 = multimodality gap method (Gaussian/non-linear clusters)

- silent
Logical flag: whether to turn off messages

- showres
Logical flag: whether to show the results on the screen

- diffusion
Logical flag: whether to perform graph diffusion to reduce noise and boost performance, usually recommended

- kerneltype
Character string: 'density' (default) = adaptive density aware kernel, 'stsc' = Zelnik-Manor self-tuning kernel

- maxk
Numerical value: the maximum number of expected clusters (default = 10). This is data dependent - do not set excessively high.

- NN
Numerical value: kernel param, the number of nearest neighbours to use for the sigma parameters (default = 3)

- NN2
Numerical value: kernel param, the number of nearest neighbours to use for the common nearest neighbours (default = 7)

- showpca
Logical flag: whether to show a PCA plot when running on one view

- showheatmap
Logical flag: whether to show a heatmap of the affinity matrix when running on one view

- showdimred
Logical flag: whether to show UMAP or t-SNE of final affinity matrix

- visualisation
Character string: what kind of dimensionality reduction to run on the affinity matrix (umap or tsne)

- frac
Numerical value: optk search param, fraction to find the last substantial drop (multimodality gap method param)

- thresh
Numerical value: optk search param, how many points ahead to keep searching (multimodality gap method param)

- fontsize
Numerical value: controls font size of the ggplot2 plots
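
To make the `kerneltype = 'stsc'` and `NN` arguments more concrete, here is a minimal base R sketch of a Zelnik-Manor style self-tuning kernel, where each sample gets a local scale equal to the distance to its NN-th nearest neighbour. The function name `stsc_kernel` is hypothetical and this is not Spectrum's own implementation; it assumes samples are columns and that no two samples are identical.

```r
# Hypothetical helper (not part of Spectrum): self-tuning affinity
# A[i, j] = exp(-d(i, j)^2 / (sigma_i * sigma_j))
stsc_kernel <- function(data, NN = 3) {
  # Euclidean distances between samples (columns of `data`).
  d <- as.matrix(dist(t(data)))

  # sigma_i: distance from sample i to its NN-th nearest neighbour.
  # Index NN + 1 skips the zero distance of each sample to itself.
  sigma <- apply(d, 1, function(row) sort(row)[NN + 1])

  A <- exp(-d^2 / outer(sigma, sigma))
  diag(A) <- 0  # no self-loops in the graph
  A
}

set.seed(7)
toy <- matrix(rnorm(5 * 12), nrow = 5)  # 5 features x 12 samples
A <- stsc_kernel(toy, NN = 3)
```

Because the scale adapts per sample, points in sparse regions still connect to their neighbours, which a single global sigma tends to miss.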

##### Value

A list containing: 1) cluster assignments, in the same order as the input data columns; 2) eigenvector analysis results (either eigenvalues or dip test statistics); 3) optimal K; 4) final affinity matrix; 5) eigenvectors and eigenvalues of the graph Laplacian.

##### Examples

```
# NOT RUN {
library(Spectrum)
# Cluster the first 50 samples (columns) of the first view in the 'brain' data
res <- Spectrum(brain[[1]][, 1:50])
# }
```
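
For multi-view input, the `data` argument expects a list of data frames whose sample columns are in the same order. The following base R sketch shows one way to prepare such a list; the view names, sample names, and random values are made up for illustration, and the final call is left commented out because it needs the Spectrum package.

```r
set.seed(42)
samples <- paste0("sample", 1:6)

# Two hypothetical views of the same 6 samples: features as rows,
# samples as columns.
view1 <- as.data.frame(matrix(rnorm(4 * 6), nrow = 4,
  dimnames = list(paste0("gene", 1:4), samples)))
view2 <- as.data.frame(matrix(rnorm(3 * 6), nrow = 3,
  dimnames = list(paste0("probe", 1:3), rev(samples))))

# view2 arrived with its columns in reverse order, so realign it to view1.
view2 <- view2[, colnames(view1)]

views <- list(view1, view2)

# Check that every view lists the samples in the same order.
same_order <- all(sapply(views, function(v) {
  identical(colnames(v), colnames(views[[1]]))
}))

# res <- Spectrum(views)  # requires the Spectrum package
```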

*Documentation reproduced from package Spectrum, version 0.2, License: AGPL-3*