Rank Constrained Similarity Learning (RCSL)

RCSL is an R toolkit for single-cell clustering and trajectory analysis using single-cell RNA-seq data.

Installation

This package can be installed through devtools in R:

$ R
> library("devtools")
> devtools::install_github("QinglinMei/RCSL")

Now RCSL can be loaded in R:

> library(RCSL)

Input

RCSL takes as input a normalized expression matrix with genes as rows and cells as columns, in log(CPM+1), log(RPKM+1), log(TPM+1) or log(FPKM+1) format, or a data file in RDS format.
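
As an illustration of the expected input shape, the following base-R sketch builds a log(CPM+1) matrix from a toy count matrix and round-trips it through an RDS file. The count values, the object names (`counts`, `cpm`, `logcpm`) and the choice of base 2 for the logarithm are assumptions made for this example; they are not taken from the RCSL package.

```r
# Toy raw count matrix: 4 genes (rows) x 3 cells (columns); values are made up.
counts <- matrix(c(10, 0, 5, 85,
                   20, 30, 0, 50,
                   0, 1, 1, 998),
                 nrow = 4,
                 dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3)))

# Counts per million: scale each cell (column) to a library size of 1e6.
cpm <- sweep(counts, 2, colSums(counts), "/") * 1e6

# log2(CPM + 1); the base of the logarithm is an assumption of this sketch.
logcpm <- log2(cpm + 1)

# Round trip through an RDS file, the other input form mentioned above.
rds_path <- tempfile(fileext = ".rds")
saveRDS(logcpm, rds_path)
logcpm_from_file <- readRDS(rds_path)
```

The resulting `logcpm` keeps genes in rows and cells in columns, which is the orientation described above.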

Usage

We provide an example script to run RCSL in demo_RCSL.R.

The nine functions of RCSL can also be run independently.

Function          Description
GenesFilter       Perform gene filtering.
SimS              Calculate the initial similarity matrix S.
NeigRepresent     Calculate the neighbor representation of cells.
EstClusters       Estimate the optimal number of clusters C.
BDSM              Learn the block-diagonal matrix B.
PlotMST           Construct the MST based on clustering results from RCSL.
PlotPseudoTime    Infer the pseudo-temporal ordering of cells.
getLineage        Infer the lineage based on the clustering results and the starting cell.
PlotTrajectory    Plot the developmental trajectory based on the clustering results and the starting cell.
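
For reference, the function index of the package states the optimization problem solved by BDSM in plain-text notation. A LaTeX transcription is given here as a reading aid. Interpreting ||B||^2 as the squared Frobenius norm, L as a graph Laplacian, and r and lambda as regularization weights are assumptions of this transcription, not statements from the package documentation.

```latex
\min_{B \ge 0,\; B\mathbf{1} = \mathbf{1},\; F^{\top} F = I}
  \; \lVert B - A \rVert_{1}
  \;+\; r \,\lVert B \rVert_{F}^{2}
  \;+\; 2\lambda \,\operatorname{tr}\!\left(F^{\top} L F\right)
```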

Example:

Load packages:

> library(RCSL)
> library(SingleCellExperiment)
> library(ggplot2)
> library(igraph)

Load Yan dataset:

> origData <- yan
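> # Assumes origData is a SingleCellExperiment that already has a "logcounts" assay;
> # adding 1 to the object itself is not a defined operation.
> data <- logcounts(origData)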
> label <- origData$cell_type1
> DataName <- "Yan"

Generate the clustering result:

> res_RCSL <- RCSL(data)

Calculate the Adjusted Rand Index:

> ARI_RCSL <- igraph::compare(res_RCSL$y, label, method = "adjusted.rand")
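
To make explicit what this score measures, here is a minimal base-R sketch of the Adjusted Rand Index computed from a contingency table. The helper name `adjusted_rand` and the toy label vectors are hypothetical and exist only for this illustration; the sketch is not the implementation used by igraph, and it does not handle degenerate cases such as a single cluster in both partitions.

```r
# Hypothetical helper: Adjusted Rand Index between two label vectors.
adjusted_rand <- function(x, y) {
  tab <- table(x, y)                       # contingency table of the two partitions
  n <- sum(tab)
  sum_ij <- sum(choose(tab, 2))            # pairs placed together in both partitions
  sum_a <- sum(choose(rowSums(tab), 2))    # pairs placed together in x
  sum_b <- sum(choose(colSums(tab), 2))    # pairs placed together in y
  expected <- sum_a * sum_b / choose(n, 2) # expected agreement under random labeling
  max_index <- (sum_a + sum_b) / 2
  (sum_ij - expected) / (max_index - expected)
}

# Identical partitions with different label names.
ari_same <- adjusted_rand(c(1, 1, 2, 2, 3, 3), c("a", "a", "b", "b", "c", "c"))

# Partitions that disagree on every pair.
ari_diff <- adjusted_rand(c(1, 1, 2, 2), c(1, 2, 1, 2))
```

In this toy case `ari_same` is 1 because the two partitions agree up to relabeling, while `ari_diff` is negative because the agreement is below what random labeling would give.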

Trajectory analysis:

> label <- origData$cell_type1
> res_TrajecAnalysis <- TrajectoryAnalysis(res_RCSL$gfData, res_RCSL$drData, res_RCSL$S,
                                         clustRes = res_RCSL$y, TrueLabel = label, startPoint = 1,
                                         dataName = DataName)

Display the plot of the constructed MST:

> res_TrajecAnalysis$MSTPlot

Display the plot of the pseudo-temporal ordering:

> res_TrajecAnalysis$PseudoTimePlot

Display the plot of the inferred developmental trajectory:

> res_TrajecAnalysis$TrajectoryPlot

A vignette in R Notebook format is available here

Required annotations for RCSL

  1. The RCSL package requires three additional packages: SingleCellExperiment (see https://bioconductor.org/packages/release/bioc/html/SingleCellExperiment.html) to read SingleCellExperiment objects, igraph (see https://igraph.org/) to find the strongest connected components, and ggplot2 (see https://cran.r-project.org/web/packages/ggplot2/index.html) to plot the developmental trajectory and the MST.
  2. The demonstration data in the Data directory were obtained from https://hemberg-lab.github.io/scRNA.seq.datasets/ and are stored in both RDS and text formats.

DEBUG

If you have problems running RCSL, please contact us at meiqinglinkf@163.com.

Install

install.packages('RCSL')

Version: 0.99.95

License: GPL-3

Maintainer: Qinglin Mei

Last Published: April 19th, 2021

Functions in RCSL (0.99.95)

EstClusters
Estimate the optimal number of clusters C for clustering.

PlotPseudoTime
Infer the pseudo-temporal ordering between the cell types using the distance from each cell type to the predefined starting cell type.

BDSM
Calculate the block-diagonal matrix B by solving: min over B>=0, B*1=1, F'*F=I of ||B - A||_1 + r*||B||^2 + 2*lambda*trace(F'*L*F)

PlotMST
Plot the Minimum Spanning Tree constructed from the clustering results of RCSL.

PlotTrajectory
Infer the developmental trajectories based on the clustering results from RCSL.

GenesFilter
Filter genes in the normalized gene expression data.

EucDist
Compute squared Euclidean distances using the identity ||A-B||^2 = ||A||^2 + ||B||^2 - 2*A'*B

RCSL
Run the RCSL program.

EProjSimplexdiag
Solve the problem: min 1/2*x'*L*x - x'*d s.t. x>=0, 1'x=1

NeigRepresent
Calculate the neighbor representation of cells from the low-dimensional gene expression matrix.

ann
Cell type annotations for the `yan` dataset by Yan et al.

SimS
Calculate the initial similarity matrix.

getLineage
Infer the developmental lineage based on the clustering results from RCSL and the pseudotime.

TrajectoryAnalysis
Trajectory analysis.

yan
A public scRNA-seq dataset by Yan et al.
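
The identity listed for EucDist can be checked numerically. The sketch below is a base-R illustration under stated assumptions: the helper `sq_euc_dist`, the random matrices and the convention that points are stored as columns are all choices made for this example, and the code is not the package's EucDist implementation.

```r
# Hypothetical helper: pairwise squared Euclidean distances between the
# columns of A and the columns of B, using
# ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * a'b.
sq_euc_dist <- function(A, B) {
  a2 <- colSums(A^2)                 # ||a_i||^2 for each column of A
  b2 <- colSums(B^2)                 # ||b_j||^2 for each column of B
  cross <- crossprod(A, B)           # A' * B, one inner product per pair
  d <- outer(a2, b2, "+") - 2 * cross
  pmax(d, 0)                         # clamp tiny negative values caused by rounding
}

set.seed(1)
A <- matrix(rnorm(12), nrow = 3)     # 3 features x 4 points
B <- matrix(rnorm(15), nrow = 3)     # 3 features x 5 points
D <- sq_euc_dist(A, B)

# Reference values from stats::dist, which works on rows, hence the transpose.
ref <- as.matrix(dist(t(cbind(A, B))))^2
ref <- ref[1:4, 5:9]

# Largest absolute difference between the two computations.
max_diff <- max(abs(D - ref))
```

Expanding the square this way replaces an explicit loop over pairs with one matrix product, which is the usual reason for using the identity on large cell-by-cell distance computations.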