HclustDepart: Cluster cells in a recursive way

Description

This function returns a list with clustering results.

Usage

HclustDepart(data, maxSplit = 10, minSize = 10, sim = 100, ...)
# S3 method for scppp
HclustDepart(data, maxSplit = 10, minSize = 10, sim = 100, ...)
# S3 method for matrix
HclustDepart(data, maxSplit = 10, minSize = 10, sim = 100, ...)

Value

A list with the following elements:

res2: a data frame containing two columns: names (cell names) and clusters (cluster label)
sigclust_p: a matrix with cells as rows and split indices as columns; the entry in row i and column j denotes the p-value for cell i at split step j
sigclust_z: a matrix with cells as rows and split indices as columns; the entry in row i and column j denotes the z-score for cell i at split step j

If the input is an S3 object of class 'scppp', the clustering results are stored in that object under "clust_results".
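As a rough illustration of how a result of this shape could be inspected, the sketch below builds a small mock list with the documented elements res2 and sigclust_p and then queries it. The cell names, cluster labels, p-values and column names are all invented for illustration; they are not output of HclustDepart, and the actual label and column naming may differ.

```r
# Mock object with the documented shape. All values are invented for
# illustration only and are NOT produced by HclustDepart.
cells  <- paste0("cell", 1:6)
labels <- c("1", "1", "2", "2", "2", "1")   # assumed label format

mock <- list(
  res2 = data.frame(names = cells, clusters = labels,
                    stringsAsFactors = FALSE),
  # Rows are cells, columns are split steps (column names are assumed).
  sigclust_p = matrix(c(rep(0.01, 6), ifelse(labels == "1", 0.40, 0.20)),
                      nrow = 6, dimnames = list(cells, c("1", "2")))
)

# Number of cells per cluster.
cluster_sizes <- table(mock$res2$clusters)

# p-values at the first split step for the cells assigned to cluster "2".
cells_in_2 <- mock$res2$names[mock$res2$clusters == "2"]
p_first    <- mock$sigclust_p[cells_in_2, 1]
```

With a real result, the same indexing would apply to the list returned by HclustDepart in place of mock.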

Arguments

data: A UMI count matrix with genes as rows and cells as columns, or an S3 object of class 'scppp'.
maxSplit: A numeric value specifying the maximum allowable number of splitting steps (default 10).
minSize: A numeric value specifying the minimal allowable cluster size (the number of cells for the smallest cluster, default 10).
sim: A numeric value specifying the number of simulations in the Monte Carlo procedure used for the statistical significance test, i.e. the n_sim argument passed to sigclust2 (default 100).
...: not used.

Details

This function clusters cells recursively. At each step, the two-way approximation is recalculated within each subcluster, and sigclust2 is used to test whether the subcluster should be split further. A non-significant result suggests that the cells are reasonably homogeneous and may come from the same cell type. To avoid over-splitting, two limits can be set beforehand: the maximum allowable number of splitting steps, maxSplit (default 10, which allows at most \(2^{10} = 1024\) clusters in total), and the minimal allowable cluster size, minSize (the number of cells a cluster needs to be eligible for further splitting, default 10). Splitting stops when any of the following conditions is met: (1) the split is no longer statistically significant; (2) the maximum allowable number of splitting steps is reached; (3) a current cluster has fewer than minSize cells (10 by default).
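The three stopping rules can be sketched in base R as follows. This is a minimal illustration and not the package's implementation: it replaces the two-way approximation with plain Ward clustering on Euclidean distances, and replaces sigclust2 with a simple Monte Carlo test against a single-Gaussian null. The helper names cluster_index, two_way_split, split_p_value and recursive_split, and the alpha threshold, are hypothetical. Unlike the input of HclustDepart, the toy matrix here has cells as rows.

```r
# Ratio of within-cluster to total sum of squares (smaller = cleaner split).
cluster_index <- function(x, labels) {
  within <- sum(sapply(unique(labels), function(k) {
    xk <- x[labels == k, , drop = FALSE]
    sum(sweep(xk, 2, colMeans(xk))^2)
  }))
  within / sum(sweep(x, 2, colMeans(x))^2)
}

# Stand-in two-way split: Ward hierarchical clustering cut into 2 groups.
two_way_split <- function(x) {
  cutree(hclust(dist(x), method = "ward.D2"), k = 2)
}

# Stand-in significance test (NOT sigclust2): compare the observed index
# with indices obtained from `sim` single-Gaussian null data sets.
split_p_value <- function(x, labels, sim = 100) {
  observed <- cluster_index(x, labels)
  sds <- apply(x, 2, sd)
  null_ci <- replicate(sim, {
    x0 <- matrix(rnorm(length(x), sd = rep(sds, each = nrow(x))),
                 nrow = nrow(x))
    cluster_index(x0, two_way_split(x0))
  })
  mean(null_ci <= observed)
}

recursive_split <- function(x, cells = seq_len(nrow(x)), depth = 0,
                            maxSplit = 10, minSize = 10, sim = 100,
                            alpha = 0.05) {
  # Rules (2) and (3): depth limit reached, or cluster too small to split.
  if (depth >= maxSplit || length(cells) < max(minSize, 2)) {
    return(list(cells))
  }
  sub <- x[cells, , drop = FALSE]
  labels <- two_way_split(sub)
  # Rule (1): keep the cluster whole unless the split is significant.
  if (!isTRUE(split_p_value(sub, labels, sim) < alpha)) {
    return(list(cells))
  }
  c(recursive_split(x, cells[labels == 1], depth + 1,
                    maxSplit, minSize, sim, alpha),
    recursive_split(x, cells[labels == 2], depth + 1,
                    maxSplit, minSize, sim, alpha))
}

set.seed(1)
# Toy data: 40 cells (rows) x 5 features, two well-separated groups.
toy <- rbind(matrix(rnorm(100, mean = 0), ncol = 5),
             matrix(rnorm(100, mean = 6), ncol = 5))
clusters <- recursive_split(toy, maxSplit = 3, minSize = 10, sim = 50)
length(clusters)   # number of final clusters, at most 2^3
```

Tracking depth per branch is what bounds the number of clusters by 2^maxSplit: each level of recursion can at most double the number of clusters.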

Examples

# Simulate a small UMI count matrix: 10 genes (rows) x 50 cells (columns)
test_set <- matrix(rpois(500, 0.5), nrow = 10)
HclustDepart(test_set)