boclust

The goal of boclust is to provide a new normalization method for sparse data, based on a feature boosting strategy with a latent representation. It is aimed especially at scRNA-seq data, which contain many zeros. Based on this normalization, a new similarity measure is defined for the subsequent clustering algorithm. Unlike many other unsupervised clustering methods, boclust suggests a value of K for the number of clusters. Because it is designed for high-dimensional data, it may be unsuitable for low-dimensional data.

There are three major functions:

  • BossaSimi: calculates the similarity matrix and the normalized data.
  • BossaClust: the main function, which returns an object containing the clustering result for shiny.
  • bossa_interactive: a shiny framework that displays the clustering result.

Installation

You can install boclust from GitHub with:

# install.packages("devtools")
devtools::install_github("TinyOpen/boclust")

Example

library(boclust)

# generate sparse data from the toy model of CIDR (8 cells in rows, 2 genes in columns)
sparse.data <- data.frame(g.1 = c(0, 5, 0, 6, 8, 6, 7, 7),
                          g.2 = c(5, 0, 0, 0, 5, 7, 5, 7))
bossa.change <- BossaSimi(sparse.data, is.pca = FALSE) # with low-dimensional data, PCA is unnecessary
data.after <- bossa.change$U.score.non.pca # data after normalization

After normalization, you can check that the first 4 cells, which actually come from the same cluster, are closer to each other. The separation between the first 4 cells and the last 4 cells is large enough to give the correct clustering result.

library(d3heatmap) # provides d3heatmap()
d3heatmap(sparse.data) ## show heatmap of original data
d3heatmap(data.after) ## show heatmap of bossa-normalized data
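
The heatmaps give a visual check. A numeric check is also possible by comparing the mean distance between cells of the same group with the mean distance between cells of different groups. The sketch below uses only base R and is run on the raw toy data so that it works without boclust; the helper group_distance_summary and the groups vector are illustrative names, not part of the package. Assuming data.after keeps one row per cell, the same helper could be applied to it to compare the two summaries.

```r
# Toy data from the example above: 8 cells (rows), 2 genes (columns).
sparse.data <- data.frame(g.1 = c(0, 5, 0, 6, 8, 6, 7, 7),
                          g.2 = c(5, 0, 0, 0, 5, 7, 5, 7))

# Assumed grouping: the first 4 cells form one cluster, the last 4 another.
groups <- rep(c(1, 2), each = 4)

# Hypothetical helper (not a boclust function): mean Euclidean distance
# within groups and between groups, counting each pair of cells once.
group_distance_summary <- function(x, groups) {
  d <- as.matrix(dist(x))                 # pairwise Euclidean distances
  same <- outer(groups, groups, "==")     # TRUE where both cells share a group
  upper <- upper.tri(d)                   # keep each unordered pair once
  c(within = mean(d[upper & same]),
    between = mean(d[upper & !same]))
}

dist.summary <- group_distance_summary(sparse.data, groups)
print(dist.summary)
```

A larger gap between the "between" and "within" values indicates better-separated groups.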

Now for high-dimensional data, which is what boclust is designed for. You can either use BossaClust to get the final result directly:

object <- BossaClust(high.dim.data) # do normalization and clustering at the same time
bossa_interactive(object) # use shiny frame to show the result

Or you can first store the normalized data obtained from BossaSimi, and then do the remaining work:

pre.object <- BossaSimi(high.dim.data)
object <- BossaClust(data = high.dim.data, data.pre = pre.object) # reuse the stored normalization and do the clustering
bossa_interactive(object) # use shiny frame to show the result
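
The snippets above assume that an object named high.dim.data already exists. For experimenting, a sparse stand-in can be simulated with base R, as in the sketch below. It follows the toy example's layout (cells in rows, genes in columns); the cluster structure, Poisson rates, and dropout rate are arbitrary assumptions made for illustration, and how well boclust clusters such data has not been verified here.

```r
set.seed(1)  # make the simulation reproducible

n.cells <- 60
n.genes <- 200
cluster <- rep(1:3, each = n.cells / 3)   # assumed: 3 clusters of 20 cells

# Background expression rate of 1; each cluster has its own block of
# 20 "marker" genes with a higher rate (values chosen arbitrarily).
rate <- matrix(1, nrow = n.cells, ncol = n.genes)
for (k in 1:3) {
  marker.genes <- ((k - 1) * 20 + 1):(k * 20)
  rate[cluster == k, marker.genes] <- 8
}

# Poisson counts, then random dropout: about half of all entries are
# forced to zero to mimic the sparsity of scRNA-seq data.
counts <- matrix(rpois(n.cells * n.genes, lambda = rate), nrow = n.cells)
dropout <- matrix(rbinom(n.cells * n.genes, size = 1, prob = 0.5),
                  nrow = n.cells)
counts[dropout == 1] <- 0

high.dim.data <- as.data.frame(counts)
colnames(high.dim.data) <- paste0("g.", seq_len(n.genes))

zero.fraction <- mean(counts == 0)        # share of zero entries
print(zero.fraction)
```

The resulting data frame can then be passed to BossaSimi or BossaClust in the same way as in the calls above.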

Install

install.packages('boclust')

Version

0.1.1

License

GPL-2

Maintainer

Kaixiu Jin

Last Published

November 23rd, 2017

Functions in boclust (0.1.1)

  • FindHcDe: Find DE from HC with recommended k.
  • OrderClust: Reindex cluster labels in ascending order.
  • FindOverlapDe: Find DE from overlap clusters.
  • BossaSimi: Bossa similarity.
  • OverlapClust: Overlap clustering.
  • ClustMerge: Cluster merge.
  • ClustShare: Calculate the share matrix among overlap clusters.
  • OverlapMelt: Prepare the data for visualization.
  • FindBefDe: Find DE from original overlap clusters.
  • TestMergeClust: Do a test before merging two overlap clusters.
  • AssignLeftClust: Assign the left points.
  • bo.simu.data: A dataset of 300 cells and 200 genes.
  • BossaClust: Bossa clustering.
  • bossa_interactive: Opens BOSSA results in an interactive session in a web browser.
  • KeyFeature: Determine the key features of each overlap cluster.