SPECK (version 1.0.1)

speck: Abundance estimation for single cell RNA-sequencing (scRNA-seq) data.

Description

Performs normalization, reduced rank reconstruction (RRR), and thresholding for an \(m \times n\) scRNA-seq matrix with \(m\) samples and \(n\) genes. The speck() function first calls randomizedRRR() on the scRNA-seq matrix. Thresholding is then applied to each gene of the resulting \(m \times n\) RRR matrix using ckmeansThreshold(), yielding an \(m \times n\) thresholded matrix. See the documentation for randomizedRRR() and ckmeansThreshold() for implementation details.
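To make the reduced rank reconstruction step concrete, here is a minimal sketch in base R. It is not the package's implementation: it swaps in the exact `svd()` from base R, skips normalization and thresholding, and uses a hypothetical helper name (`reconstruct.rank.k`) and made-up toy data.

```r
# Toy matrix: 6 samples (rows) by 4 genes (columns); values are made up.
set.seed(1)
toy.mat <- matrix(rpois(24, lambda = 3), nrow = 6, ncol = 4)

# Hypothetical helper (not part of SPECK): rank-k reconstruction via base svd().
reconstruct.rank.k <- function(mat, k) {
  decomp <- svd(mat, nu = k, nv = k)
  # Keep only the k largest singular values and their singular vectors.
  decomp$u %*% diag(decomp$d[seq_len(k)], nrow = k) %*% t(decomp$v)
}

rrr.toy <- reconstruct.rank.k(toy.mat, k = 2)
dim(rrr.toy)        # same dimensions as the input matrix
qr(rrr.toy)$rank    # numerical rank of the reconstruction
```

The reconstruction keeps the \(m \times n\) shape of the input, which is why the RRR and thresholded matrices returned by speck() have the same dimensions as counts.matrix.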

Usage

speck(
  counts.matrix,
  rank.range.end = 100,
  min.consec.diff = 0.01,
  rep.consec.diff = 2,
  manual.rank = NULL,
  max.num.clusters = 4,
  seed.rsvd = 1,
  seed.ckmeans = 2
)

Value

  • thresholded.mat - An \(m \times n\) thresholded RRR matrix with \(m\) samples and \(n\) genes.

  • rrr.mat - An \(m \times n\) RRR matrix with \(m\) samples and \(n\) genes.

  • rrr.rank - Automatically computed rank.

  • component.stdev - A vector corresponding to standard deviations of non-centered sample principal components.

  • clust.num - A vector of length \(n\) indicating the number of clusters identified by the Ckmeans.1d.dp::Ckmeans.1d.dp() algorithm for each gene.

  • clust.max.prop - A vector of length \(n\) indicating the proportion of samples with the specified maximum number of clusters for each gene.
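As a rough illustration of what "standard deviations of non-centered sample principal components" means, the sketch below computes them two equivalent ways in base R. The toy data are made up, and whether speck() derives component.stdev in exactly this way (for example, from a randomized decomposition of the normalized matrix) is an assumption not confirmed here.

```r
# Toy matrix: 10 samples (rows) by 4 genes (columns); values are made up.
set.seed(3)
toy.mat <- matrix(rpois(40, lambda = 5), nrow = 10, ncol = 4)

# Standard deviations of principal components computed without centering.
pc.stdev <- prcomp(toy.mat, center = FALSE, scale. = FALSE)$sdev

# The same quantities from the singular values of the uncentered matrix.
sv.stdev <- svd(toy.mat)$d / sqrt(nrow(toy.mat) - 1)

all.equal(pc.stdev, sv.stdev)
```

Plotting such a vector against the component index gives the usual scree-style curve that a rank selection procedure can work from.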

Arguments

counts.matrix

\(m \times n\) scRNA-seq counts matrix with \(m\) samples and \(n\) genes.

rank.range.end

Upper value of the rank for RRR.

min.consec.diff

Minimum difference in the rate of change between a pair of successive standard deviation estimates.

rep.consec.diff

Frequency of the minimum difference in the rate of change between a pair of successive standard deviation estimates.

manual.rank

Optional user-specified upper value of the rank used for RRR, as an alternative to the automatically computed rank.

max.num.clusters

Maximum number of clusters considered when thresholding each gene.

seed.rsvd

Seed specified to ensure reproducibility of the RRR.

seed.ckmeans

Seed specified to ensure reproducibility of the clustered thresholding.
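The seed arguments matter because randomized decompositions draw random numbers. The sketch below shows the idea with a basic randomized range finder written in base R. It is an illustration only: the helper name `randomized.rank.k` is hypothetical, the toy data are made up, and the assumption that seed.rsvd controls a randomized SVD of this general kind is inferred from the argument name rather than stated in this page.

```r
# Hypothetical helper (not part of SPECK): randomized rank-k approximation.
randomized.rank.k <- function(mat, k, seed) {
  set.seed(seed)  # note: this changes the global RNG state
  # Random Gaussian test matrix that samples the column space of mat.
  omega <- matrix(rnorm(ncol(mat) * k), nrow = ncol(mat), ncol = k)
  # Orthonormal basis for the sampled column space.
  basis <- qr.Q(qr(mat %*% omega))
  # Project mat onto that basis to obtain a rank-k approximation.
  basis %*% crossprod(basis, mat)
}

# Toy matrix: 20 samples (rows) by 10 genes (columns); values are made up.
set.seed(10)
toy.mat <- matrix(rbinom(200, size = 20, prob = 0.2), nrow = 20, ncol = 10)

run.a <- randomized.rank.k(toy.mat, k = 3, seed = 1)
run.b <- randomized.rank.k(toy.mat, k = 3, seed = 1)
run.c <- randomized.rank.k(toy.mat, k = 3, seed = 2)

identical(run.a, run.b)   # same seed: the two runs can be compared directly
identical(run.a, run.c)   # different seed: a different random projection
```

Fixing the seeds in a speck() call therefore lets two runs on the same counts matrix be compared value for value.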

Examples

library(SPECK)

# Simulate an 80 x 230 counts matrix (80 samples, 230 genes).
set.seed(10)
data.mat <- matrix(data = rbinom(n = 18400, size = 230, prob = 0.01), nrow = 80)

# Run the full pipeline: normalization, RRR, then per-gene thresholding.
speck.full <- speck(counts.matrix = data.mat, rank.range.end = 60,
                    min.consec.diff = 0.01, rep.consec.diff = 2,
                    manual.rank = NULL, max.num.clusters = 4,
                    seed.rsvd = 1, seed.ckmeans = 2)

# Inspect the rank selection output.
print(speck.full$component.stdev)
print(speck.full$rrr.rank)

# Inspect the per-gene clustering summaries.
head(speck.full$clust.num); table(speck.full$clust.num)
head(speck.full$clust.max.prop); table(speck.full$clust.max.prop)

# Extract the thresholded matrix.
speck.output <- speck.full$thresholded.mat
dim(speck.output); str(speck.output)
