sampleImputation: Sample-mean Estimation

Description

Clusters cells with a shared nearest neighbor (SNN) graph built from a given list of genes, then estimates missing expression values for each cell-gene combination as the within-cluster mean of non-zero expression.
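
To make the "within-cluster non-zero expression mean" concrete, here is a minimal base-R sketch. It is an illustration only, not the package's implementation: the helper `impute_within_cluster` and the cluster labels in `clusters` are hypothetical, and the labels are supplied by hand rather than derived from SNN clustering.

```r
# Toy log-normalized matrix: 4 genes (rows) x 6 cells (columns).
# Zeros are treated as missing values to be imputed.
expr <- matrix(
  c(2, 0, 3, 0,
    0, 1, 3, 0,
    4, 1, 0, 0,
    0, 5, 1, 2,
    1, 0, 1, 4,
    0, 3, 0, 0),
  nrow = 4,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:6))
)

# Hypothetical cluster labels, one per cell (assumed, not computed by SNN here).
clusters <- c(1, 1, 1, 2, 2, 2)

# Hypothetical helper: replace each zero with the mean of the non-zero
# values of the same gene among cells in the same cluster.
impute_within_cluster <- function(expr, clusters) {
  imputed <- expr
  for (cl in unique(clusters)) {
    cells <- which(clusters == cl)
    block <- expr[, cells, drop = FALSE]
    nonzero_n <- rowSums(block != 0)
    # Genes with no non-zero value in the cluster keep a mean of 0.
    nonzero_mean <- ifelse(nonzero_n > 0, rowSums(block) / nonzero_n, 0)
    # Repeat the per-gene means across the cluster's columns.
    fill <- matrix(nonzero_mean, nrow = nrow(block), ncol = ncol(block))
    block[block == 0] <- fill[block == 0]
    imputed[, cells] <- block
  }
  imputed
}

imputed <- impute_within_cluster(expr, clusters)
```

Observed (non-zero) entries are left unchanged; only zeros are filled, and a gene that is zero in every cell of a cluster stays zero in that cluster.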

Usage

sampleImputation(
  expression_matrix,
  subset_genes = NULL,
  scale_data = TRUE,
  number_pcs = 8,
  k_neighbors = 30,
  snn_resolution = 0.9,
  impute_index = NULL,
  pseudo_zero = NULL,
  python_path = NULL,
  verbose = FALSE
)

Arguments

expression_matrix

Log-normalized expression matrix with genes as rows and cells as columns

subset_genes

A vector of informative gene names; defaults to all genes

scale_data

Whether to standardize expression by gene, default TRUE

number_pcs

Number of principal components used to inform SNN clustering

k_neighbors

Number of nearest neighbors (k) used to build the nearest-neighbor network

snn_resolution

Resolution parameter for SNN clustering

impute_index

Indices of the entries to impute; defaults to all zero entries

pseudo_zero

Pseudo-zero expression value

python_path

Path to your Python binary (default = system path)

verbose

Print progress output to the console
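
The `scale_data` and `number_pcs` arguments describe a standardize-then-reduce step ahead of clustering. The following base-R sketch shows one way such a step could look, assuming PCA via `stats::prcomp`; the package's internals may differ, and the object names (`cells_by_genes`, `cell_embedding`) are illustrative only.

```r
set.seed(1)

# Toy log-transformed matrix: 50 genes (rows) x 30 cells (columns).
expr <- matrix(
  stats::rpois(50 * 30, 3),
  nrow = 50,
  dimnames = list(paste0("gene", 1:50), paste0("cell", 1:30))
)
expr <- log1p(expr)

# Standardize each gene to mean 0 and standard deviation 1.
# scale() operates on columns, so transpose to cells x genes first.
cells_by_genes <- scale(t(expr))

# Genes with zero variance become NaN after scaling; drop them.
keep <- colSums(is.na(cells_by_genes)) == 0
cells_by_genes <- cells_by_genes[, keep, drop = FALSE]

# Reduce to the first `number_pcs` principal components (data already scaled).
number_pcs <- 8
pca <- stats::prcomp(cells_by_genes, center = FALSE, scale. = FALSE)
cell_embedding <- pca$x[, seq_len(number_pcs), drop = FALSE]

dim(cell_embedding)
```

Each row of `cell_embedding` is one cell described by `number_pcs` coordinates, which is the kind of input a nearest-neighbor search with `k_neighbors` would operate on.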

Value

Returns a sparse matrix of class 'dgCMatrix'
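
For readers unfamiliar with the 'dgCMatrix' class from the Matrix package, this short sketch shows how such an object can be created, inspected, and converted back to a dense matrix. The toy values are made up for illustration and are not output of `sampleImputation`.

```r
library(Matrix)

# Small dense matrix with mostly zero entries (2 rows x 3 columns).
dense <- matrix(c(0, 1.5, 0, 0, 0, 2), nrow = 2)

# Matrix() with sparse = TRUE stores a general numeric matrix as a 'dgCMatrix'.
sparse <- Matrix::Matrix(dense, sparse = TRUE)

class(sparse)            # class of the sparse object
Matrix::nnzero(sparse)   # number of stored non-zero entries
as.matrix(sparse)        # convert back to a base R dense matrix
```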

Examples

# NOT RUN {
set.seed(0)
requireNamespace("Matrix")

## generate (meaningless) counts
c1 <- stats::rpois(5e3, 1)
c2 <- stats::rpois(5e3, 2)
m <- t(
  rbind(
    matrix(c1, nrow = 20),
    matrix(c2, nrow = 20)
  )
)

## construct an expression matrix m
colnames(m) <- paste0('cell', seq_len(ncol(m)))
rownames(m) <- paste0('gene', seq_len(nrow(m)))
## normalize each cell (column) by its total count, then log-transform
m <- log(t(t(m)/colSums(m))*1e4 + 1)
m <- methods::as(m, 'dgCMatrix')

## impute
m_imputed <- rescue::sampleImputation(
  expression_matrix = m,
  k_neighbors = 10
)
# }