Coarse-graining of large single-cell RNA-seq data into metacells

SuperCell is an R package for coarse-graining large single-cell RNA-seq data into metacells and performing downstream analysis at the metacell level.

The exponential growth in the size of scRNA-seq datasets is an important hurdle for downstream analyses. One way to facilitate the analysis of large-scale, noisy scRNA-seq data is to merge transcriptionally highly similar cells into metacells. This concept was first introduced by Baran et al., 2019 (MetaCell) and by Iacono et al., 2018 (bigSCale). More recent methods to build metacells have been described in Ben-Kiki et al., 2022 (MetaCell2), Bilous et al., 2022 (SuperCell) and Persad et al., 2022 (SEACells). Despite some differences in implementation, all of these methods are network-based and can be summarized as follows:

1. A single-cell network is computed based on cell-to-cell similarity (in transcriptomic space)

2. Highly similar cells are identified as those forming dense regions in the single-cell network and merged together into metacells (coarse-graining)

3. Transcriptomic information within each metacell is combined (average or sum).

4. Metacell data are used for the downstream analyses instead of large-scale single-cell data
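The four steps above can be illustrated with a toy sketch in base R. This is not the SuperCell implementation: the data are simulated, and hierarchical clustering on PCA distances (`hclust`) is used here as a simple stand-in for the network-based merging that the actual methods perform. All variable names and parameter values are assumptions chosen for illustration.

```r
set.seed(1)

# Simulated count matrix: 50 genes x 200 cells from two artificial populations.
n_genes <- 50
n_cells <- 200
population <- rep(c("A", "B"), each = n_cells / 2)
lambda <- ifelse(population == "A", 2, 8)  # hypothetical mean expression
counts <- matrix(
  rpois(n_genes * n_cells, lambda = rep(lambda, each = n_genes)),
  nrow = n_genes, ncol = n_cells,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("cell", seq_len(n_cells)))
)

# Step 1: cell-to-cell similarity in a reduced transcriptomic space.
log_expr <- log1p(counts)
pca <- prcomp(t(log_expr), center = TRUE, scale. = FALSE, rank. = 10)
cell_dist <- dist(pca$x)

# Step 2: merge highly similar cells (stand-in: hierarchical clustering,
# cut so that the number of groups matches the chosen graining level).
gamma <- 10
n_metacells <- round(n_cells / gamma)
tree <- hclust(cell_dist, method = "average")
membership <- cutree(tree, k = n_metacells)

# Step 3: combine transcriptomic information within each metacell.
metacell_counts <- t(rowsum(t(counts), group = membership))  # sum
metacell_size <- as.vector(table(membership))
metacell_avg <- sweep(metacell_counts, 2, metacell_size, "/")  # average

# Step 4: downstream analyses now operate on a much smaller matrix.
dim(counts)
dim(metacell_counts)
```

Keeping `metacell_size` alongside the aggregated matrix matters because metacells contain different numbers of cells, and downstream steps may need to weight them accordingly.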

Unlike clustering, the aim of metacells is not to identify large groups of cells that comprehensively capture biological concepts, such as cell types, but to merge cells that share highly similar profiles and may carry repetitive information. Therefore, metacells represent a compromise structure that removes redundant information in scRNA-seq data while preserving the biologically relevant heterogeneity.

An important concept when building metacells is the graining level (γ), which we define as the ratio between the number of single cells in the initial data and the number of metacells. We suggest applying γ between 10 and 50, which significantly reduces the computational resources needed to perform the downstream analyses while preserving most of the results of the initial (i.e., single-cell) analyses.
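As a quick worked example of this definition, the sketch below converts between γ and the number of metacells. The helper names are hypothetical, and rounding to the nearest integer is an assumption made here for illustration.

```r
# Graining level as defined above: single cells per metacell.
graining_level <- function(n_cells, n_metacells) n_cells / n_metacells

# Number of metacells implied by a chosen graining level
# (rounding is an assumption of this sketch).
n_metacells_for <- function(n_cells, gamma) round(n_cells / gamma)

n_cells <- 100000
gammas <- c(10, 20, 50)
data.frame(gamma = gammas, n_metacells = n_metacells_for(n_cells, gammas))
```

For a dataset of 100,000 cells, the suggested range of γ between 10 and 50 therefore corresponds to between 10,000 and 2,000 metacells.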

Installation

SuperCell requires igraph, RANN, WeightedCluster, corpcor, weights, Hmisc, Matrix, matrixStats, plyr, irlba, grDevices, patchwork, ggplot2. SuperCell uses velocyto.R for RNA velocity.

install.packages("igraph")
install.packages("RANN")
install.packages("WeightedCluster")
install.packages("corpcor")
install.packages("weights")
install.packages("Hmisc")
install.packages("Matrix")
install.packages("matrixStats")
install.packages("patchwork")
install.packages("ggplot2")
install.packages("plyr")
install.packages("irlba")
# grDevices ships with base R and does not need to be installed
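To avoid reinstalling packages that are already present, one option is to install only the missing dependencies. The helper `missing_packages` below is hypothetical (it is not part of SuperCell), and the `interactive()` guard is a choice made for this sketch so that nothing is installed when the code runs in a script.

```r
# Hypothetical helper: return the packages from `pkgs` that cannot be loaded.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# CRAN dependencies listed above (grDevices is part of base R).
deps <- c("igraph", "RANN", "WeightedCluster", "corpcor", "weights", "Hmisc",
          "Matrix", "matrixStats", "plyr", "irlba", "patchwork", "ggplot2")

to_install <- missing_packages(deps)

# Install only in an interactive session, and only if something is missing.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```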

Installing the SuperCell package from GitHub

if (!requireNamespace("remotes")) install.packages("remotes")
remotes::install_github("GfellerLab/SuperCell")

library(SuperCell)

Examples

  1. Building and analyzing metacells with SuperCell
  2. Building metacells with SuperCell and analyzing them with a standard Seurat pipeline
  3. Data integration of metacells built with SuperCell
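A condensed sketch of the first workflow might look as follows. It uses the `cell_lines` dataset and the `SCimplify` and `supercell_GE` functions listed in the package index; the argument names (`gamma`, `k.knn`, `n.var.genes`), the `GE` element of `cell_lines`, and the `membership` field of the result are assumptions that should be checked against `?SCimplify` and `?cell_lines` for the installed version. The parameter values are illustrative only.

```r
# Run only if SuperCell is installed.
if (requireNamespace("SuperCell", quietly = TRUE)) {
  library(SuperCell)

  # Example dataset shipped with the package (genes x cells matrix assumed).
  data(cell_lines)
  GE <- cell_lines$GE

  # Build metacells at graining level gamma (values are illustrative).
  gamma <- 20
  SC <- SCimplify(GE, gamma = gamma, k.knn = 5, n.var.genes = 1000)

  # Aggregate single-cell expression into a metacell expression matrix.
  SC.GE <- supercell_GE(GE, SC$membership)
  dim(SC.GE)
}
```

The resulting `SC.GE` matrix can then be passed to the metacell-level analysis functions, or converted with `supercell_2_Seurat` for the Seurat-based workflow.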

License

SuperCell is developed by the group of David Gfeller at the University of Lausanne.

SuperCell is available under the GPL-3 license.

For scientific questions, please contact Mariia Bilous (mariia.bilous@unil.ch) or David Gfeller (David.Gfeller@unil.ch).

How to cite

If you use SuperCell in a publication, please cite: Bilous et al. Metacells untangle large and complex single-cell transcriptome networks, BMC Bioinformatics (2022).

Install

install.packages('SuperCell')

Monthly Downloads

577

Version

1.0

License

GPL-3

Maintainer

Leonard Herault

Last Published

September 5th, 2024

Functions in SuperCell (1.0)

supercell_UMAP

Compute UMAP of super-cells
supercell_2_Seurat

Super-cells to Seurat object
supercell_FindAllMarkers

Differential expression analysis of super-cell data. Most of the parameters are the same as in Seurat FindAllMarkers (for simplicity)
supercell_2_sce

Super-cells to SingleCellExperiment object
supercell_cluster

Cluster super-cell data
supercell_mergeGE

Merging metacell gene expression matrices from several independent SuperCell objects
supercell_merge

Merging independent SuperCell objects
supercell_estimate_velocity

supercell_assign

Assign super-cells to the most abundant cluster
supercell_plot

Plot metacell network
supercell_plot_GE

Plot super-cell network colored by the expression of a gene (gradient color)
supercell_plot_UMAP

Plot super-cell UMAP (result of supercell_UMAP). Use supercell_DimPlot instead.
supercell_GeneGenePlot

Gene-gene correlation plot
supercell_VlnPlot

Violin plots
supercell_silhouette

Compute Silhouette index accounting for sample size (super-cell size)
supercell_plot_tSNE

Plot super-cell tSNE (result of supercell_tSNE). Use supercell_DimPlot instead.
supercell_tSNE

Compute tSNE of super-cells
supercell_VlnPlot_single

Plot Violin plot for 1 feature
supercell_prcomp

Compute PCA for super-cell data (sample-weighted data)
supercell_purity

Compute purity of super-cells
supercell_rescale

Rescale supercell object
metacell2_anndata_2_supercell

Convert Metacells (Metacell-2) to Super-cell like object
knn_graph_from_dist

Build kNN graph from distance (used in "build_knn_graph")
build_knn_graph

Build kNN graph
SCimplify_from_embedding

Detection of metacells with the SuperCell approach from a low-dimensional representation
build_knn_graph_nn2

Build kNN graph using RANN::nn2 (used in "build_knn_graph")
SCimplify_for_velocity

Construct super-cells from spliced and un-spliced matrices
sc_mixing_score

Compute mixing of single cells within super-cells
anndata_2_supercell

Convert Anndata metacell object (Metacell-2 or SEACells) to Super-cell like object
SCimplify

Detection of metacells with the SuperCell approach
cell_lines

Cancer cell lines dataset
supercell_GeneGenePlot_single

Plot Gene-gene correlation plot for 1 feature
supercell_DimPlot

Plot metacell 2D plot (PCA, UMAP, tSNE etc)
supercell_FindMarkers

Differential expression analysis of super-cell data. Most of the parameters are the same as in Seurat FindMarkers (for simplicity)
supercell_GE_idx

Simplification of scRNA-seq dataset (old version, not used since 12.02.2021)
supercell_GE

Simplification of scRNA-seq dataset