Spectrum is a fast adaptive spectral clustering method for single- or multi-view data. Spectrum uses a new type of adaptive density-aware kernel that strengthens local connections in the graph. To integrate multi-view data and reduce noise, it applies a tensor product graph data integration and diffusion procedure. Spectrum contains two approaches for finding the number of clusters (K): the classical eigengap method and a novel multimodality gap method. The multimodality gap method analyses the distribution of the eigenvectors of the graph Laplacian to decide K and can also be used to tune the kernel.
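To make the eigengap idea concrete, the following base R sketch applies the classical heuristic to toy data. It is an illustration under stated assumptions, not Spectrum's internal code: it uses a plain Gaussian kernel with a fixed bandwidth in place of the adaptive density-aware kernel, and the names `bandwidth`, `gaps` and `k_hat` are made up for this example.

```r
# Toy data: three well-separated 2-D blobs, features as rows, samples as columns
set.seed(1)
centres <- cbind(c(0, 0), c(6, 0), c(0, 6))
x <- do.call(cbind, lapply(1:3, function(i) {
  matrix(rnorm(2 * 20, sd = 0.5), nrow = 2) + centres[, i]
}))

# Assumption: fixed-bandwidth Gaussian kernel (a stand-in for Spectrum's kernel)
bandwidth <- 1
d2 <- as.matrix(dist(t(x)))^2          # squared distances between samples
A <- exp(-d2 / (2 * bandwidth^2))      # affinity matrix
diag(A) <- 0

# Symmetrically normalised affinity: D^(-1/2) A D^(-1/2)
dinv <- 1 / sqrt(rowSums(A))
L <- A * outer(dinv, dinv)

# Eigenvalues come back in decreasing order; look at the first maxk + 1
maxk <- 10
ev <- eigen(L, symmetric = TRUE, only.values = TRUE)$values[1:(maxk + 1)]

# The eigengap heuristic picks K at the largest drop between consecutive eigenvalues
gaps <- ev[1:maxk] - ev[2:(maxk + 1)]
k_hat <- which.max(gaps)
print(k_hat)
```

Keeping `maxk` modest matters here because only the first `maxk` gaps are inspected, which limits the chance of picking up a spurious drop further down the spectrum.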
Spectrum(data, method = 1, silent = FALSE, showres = TRUE,
diffusion = TRUE, kerneltype = c("density", "stsc"), maxk = 10,
NN = 3, NN2 = 7, showpca = FALSE, showheatmap = FALSE,
showdimred = FALSE, visualisation = c("umap", "tsne"), frac = 2,
thresh = 7, fontsize = 18)
data
Data frame or list of data frames: contains the data with samples as columns and features as rows. For multi-view data, supply a list of data frames with the samples in the same order in every view.
method
Numerical value: 1 = default eigengap method (Gaussian clusters), 2 = multimodality gap method (Gaussian/non-linear clusters)
silent
Logical flag: whether to turn off messages
showres
Logical flag: whether to show the results on the screen
diffusion
Logical flag: whether to perform graph diffusion to reduce noise and boost performance (usually recommended)
kerneltype
Character string: 'density' (default) = adaptive density-aware kernel, 'stsc' = Zelnik-Manor self-tuning kernel
maxk
Numerical value: the maximum number of expected clusters (default = 10). This is data dependent; do not set it excessively high.
NN
Numerical value: kernel parameter, the number of nearest neighbours to use for the sigma parameters (default = 3)
NN2
Numerical value: kernel parameter, the number of nearest neighbours to use for the common nearest neighbours (default = 7)
showpca
Logical flag: whether to show a PCA plot when running on one view
showheatmap
Logical flag: whether to show a heatmap of the affinity matrix when running on one view
showdimred
Logical flag: whether to show a UMAP or t-SNE of the final affinity matrix
visualisation
Character string: which dimensionality reduction to run on the affinity matrix ('umap' or 'tsne')
frac
Numerical value: optimal K search parameter, the fraction used to find the last substantial drop (multimodality gap method only)
thresh
Numerical value: optimal K search parameter, how many points ahead to keep searching (multimodality gap method only)
fontsize
Numerical value: controls the font size of the ggplot2 plots
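The role of NN in the 'stsc' kernel can be sketched in base R. This follows the published Zelnik-Manor and Perona self-tuning formula, A_ij = exp(-d_ij^2 / (sigma_i * sigma_j)), where sigma_i is the distance from sample i to its NN-th nearest neighbour. It is a minimal sketch of that formula and may differ in detail from what Spectrum computes internally; the variable names are illustrative.

```r
# Toy data: 5 features (rows) by 30 samples (columns)
set.seed(2)
x <- matrix(rnorm(5 * 30), nrow = 5)

NN <- 3                                  # same default as the NN argument
d <- as.matrix(dist(t(x)))               # Euclidean distances between samples

# Local scale per sample: distance to its NN-th nearest neighbour.
# sort(row)[1] is the zero self-distance, so the NN-th neighbour is at NN + 1.
sigma <- apply(d, 1, function(row) sort(row)[NN + 1])

# Self-tuning affinity: each pair is scaled by both samples' local scales
A <- exp(-d^2 / outer(sigma, sigma))

print(round(A[1:3, 1:3], 3))
```

A small NN makes each local scale follow the immediate neighbourhood closely, so dense regions get a sharp kernel and sparse regions a wider one.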
A list, containing:
1) cluster assignments, in the same order as the input data columns
2) eigenvector analysis results (either eigenvalues or dip test statistics)
3) optimal K
4) final affinity matrix
5) eigenvectors and eigenvalues of the graph Laplacian
# NOT RUN {
library(Spectrum)
# Cluster the first 50 samples (columns) of the first view in the 'brain' list
res <- Spectrum(brain[[1]][,1:50])
# }
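For multi-view input, the samples must be in the same order in every data frame. The sketch below builds two toy views, aligns their columns, and then calls Spectrum only if the package is installed. The toy data, the helper `make_view` and the variable names are hypothetical; the results are read by position, following the order of the returned list described above, rather than by element name.

```r
set.seed(3)
samples <- paste0("s", 1:40)
shift <- rep(c(0, 4), each = 20)         # two groups of samples, offset by 4

# Hypothetical helper: one view with samples as columns and features as rows
make_view <- function(n_features) {
  m <- matrix(rnorm(n_features * 40), nrow = n_features)
  m <- sweep(m, 2, shift, "+")           # add the group offset to each column
  dimnames(m) <- list(paste0("f", seq_len(n_features)), samples)
  as.data.frame(m)
}

view1 <- make_view(20)
view2 <- make_view(15)[, sample(samples)]  # columns deliberately shuffled

# Align the second view to the sample order of the first
view2 <- view2[, colnames(view1)]
views <- list(view1, view2)
stopifnot(identical(colnames(views[[1]]), colnames(views[[2]])))

# Only run the clustering when the Spectrum package is available
if (requireNamespace("Spectrum", quietly = TRUE)) {
  res <- Spectrum::Spectrum(views, silent = TRUE, showres = FALSE, maxk = 5)
  clusters <- res[[1]]                   # 1) cluster assignments
  k <- res[[3]]                          # 3) optimal K
}
```

Checking the column order before the call is cheap and guards against views that are silently misaligned.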