Unsupervised clustering of cells is a common step in many single-cell
expression workflows. In an experiment containing a mixture of cell types,
each cluster might correspond to a different cell type. This function takes
a cell_data_set as input, clusters the cells using Louvain community
detection, and returns a cell_data_set with internally stored cluster
assignments. In addition to clusters, this function calculates partitions,
which represent superclusters of the Louvain communities that are found
using a kNN pruning method. Cluster assignments can be accessed using the
clusters function, and partition assignments can be accessed using the
partitions function.
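As a rough illustration of the graph that community detection operates on, the sketch below builds a k nearest neighbor table for a toy 2-D embedding in base R and computes a Jaccard coefficient between the neighbor sets of two cells (the optional weighting controlled by the weight argument). The helper names knn_indices and jaccard_weight and the toy coordinates are assumptions made for this example; they are not part of the package, whose internal routines may differ.

```r
# Toy data: 6 "cells" in a 2-D embedding, forming two well-separated groups.
coords <- rbind(
  c(0.0, 0.0), c(0.1, 0.0), c(0.0, 0.1),
  c(5.0, 5.0), c(5.1, 5.0), c(5.0, 5.1)
)
rownames(coords) <- paste0("cell", 1:6)

# Hypothetical helper: return an n x k matrix whose row i holds the indices
# of the k nearest neighbors of point i (excluding i), by Euclidean distance.
knn_indices <- function(x, k) {
  d <- as.matrix(dist(x))
  diag(d) <- Inf  # a point is not its own neighbor
  nn <- lapply(seq_len(nrow(d)), function(i) order(d[i, ])[seq_len(k)])
  matrix(unlist(nn), ncol = k, byrow = TRUE)
}

# Hypothetical helper: Jaccard coefficient between the neighbor sets of
# cells i and j (size of the intersection divided by size of the union).
jaccard_weight <- function(nn, i, j) {
  length(intersect(nn[i, ], nn[j, ])) / length(union(nn[i, ], nn[j, ]))
}

nn <- knn_indices(coords, k = 2)

# Cells 1 and 2 are in the same group and share a neighbor, so their weight
# is positive; cells 1 and 4 are in different groups and share none.
w_within  <- jaccard_weight(nn, 1, 2)
w_between <- jaccard_weight(nn, 1, 4)
```

With a larger k, each neighbor set reaches further across the embedding, so more cells become connected and the resulting communities tend to be coarser.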
cluster_cells(cds, reduction_method = c("UMAP", "tSNE", "PCA", "LSI"),
k = 20, louvain_iter = 1, partition_qval = 0.05, weight = FALSE,
resolution = NULL, random_seed = 0L, verbose = F, ...)
cds: The cell_data_set upon which to perform clustering.
reduction_method: The dimensionality reduction method upon which to base clustering. Options are "UMAP", "tSNE", "PCA" and "LSI".
k: Integer number of nearest neighbors to use when creating the k nearest neighbor graph for Louvain clustering. k is related to the resolution of the clustering result: a larger k results in lower resolution and vice versa. Default is 20.
louvain_iter: Integer number of iterations used for Louvain clustering. The clustering result giving the largest modularity score will be used as the final clustering result. Default is 1. Note that if louvain_iter is greater than 1, the random_seed argument will be ignored.
partition_qval: Numeric, the q-value cutoff to determine when to partition. Default is 0.05.
weight: A logical argument to determine whether or not to use Jaccard coefficients for two nearest neighbors (based on the overlap of their kNN sets) as the weight used for Louvain clustering. Default is FALSE.
resolution: Parameter that controls the resolution of clustering. If NULL (the default), the parameter is determined automatically.
random_seed: The seed used by the random number generator in the louvain-igraph package. This argument will be ignored if louvain_iter is larger than 1.
verbose: A logical flag to determine whether or not to print the run details.
...: Additional arguments passed to the louvain-igraph Python package.
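The rule described for louvain_iter, keeping the run with the largest modularity score, can be sketched in base R as follows. This is a minimal illustration under stated assumptions: modularity_score is a hypothetical helper written for this example (the standard Newman modularity for an undirected, unweighted graph), and the three candidate membership vectors are hand-written stand-ins for the output of separate clustering runs, not results produced by the package.

```r
# Hypothetical helper: modularity Q of a partition on an undirected,
# unweighted graph given as a symmetric 0/1 adjacency matrix, no self-loops:
#   Q = (1 / 2m) * sum_ij (A_ij - k_i * k_j / 2m) * [c_i == c_j]
modularity_score <- function(adj, membership) {
  two_m <- sum(adj)                 # every edge is counted twice
  deg   <- rowSums(adj)             # node degrees k_i
  same  <- outer(membership, membership, "==")
  sum((adj - outer(deg, deg) / two_m) * same) / two_m
}

# Toy graph: two triangles (nodes 1-3 and 4-6) joined by one edge (3-4).
edges <- rbind(c(1, 2), c(1, 3), c(2, 3),
               c(4, 5), c(4, 6), c(5, 6),
               c(3, 4))
adj <- matrix(0, 6, 6)
adj[edges] <- 1
adj[edges[, 2:1]] <- 1              # make the matrix symmetric

# Stand-ins for the memberships returned by three separate runs.
candidates <- list(
  run1 = c(1, 1, 1, 2, 2, 2),       # one community per triangle
  run2 = c(1, 1, 2, 2, 3, 3),       # splits both triangles
  run3 = c(1, 1, 1, 1, 1, 1)        # everything in one community
)

# Score each candidate and keep the one with the largest modularity.
scores <- vapply(candidates, function(m) modularity_score(adj, m), numeric(1))
best   <- names(which.max(scores))
```

Here run1, which assigns each triangle to its own community, scores highest and would be the one retained.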
An updated cell_data_set object, with cluster and partition
information stored internally and accessible using
clusters and partitions.
Rodriguez, A., & Laio, A. (2014). Clustering by fast search and find of density peaks. Science, 344(6191), 1492-1496. doi:10.1126/science.1242072
Blondel, V. D., Guillaume, J.-L., Lambiotte, R., & Lefebvre, E. (2008). Fast unfolding of communities in large networks. J. Stat. Mech., P10008.
Levine, J. H., et al. (2015). Data-Driven Phenotypic Dissection of AML Reveals Progenitor-like Cells that Correlate with Prognosis. Cell.