Workhorse function ('slice apply') designed to handle large scRNA-Seq gene expression matrices such as embedded Seurat matrices, and apply a function to the whole matrix. Very large matrices are handled by slicing rows into blocks to avoid excess memory requirements.
slapply(x, FUN, combine = "c", progress = TRUE, sliceMem = 16, cores = 1L, ...)
The returned data type will depend on the functions specified by FUN and combine.
x: Matrix, sparse matrix or DelayedMatrix of raw counts with genes in rows and cells in columns.
FUN: Function to be applied to each subblock of the matrix.
combine: A function or the name of a function used to combine results after
slicing. As the function is usually applied to blocks of 30000 genes or so,
the result is usually a vector with one element per gene. Hence 'c' is the
default function for combining vectors into a single longer vector. However,
if each gene row returns several results (e.g. a vector or dataframe),
then combine could be set to 'rbind'.
progress: Logical, whether to show progress.
sliceMem: Maximum amount of memory in GB to allow for each subsetted count
matrix object. When x is subsetted by each cell subclass, if the amount
of memory would be above sliceMem then slicing is activated and the
subsetted count matrix is divided into chunks and processed separately.
The limit is just under 17.2 GB (2^34 / 1e9). At this level the subsetted
matrix breaches the long vector limit (>2^31 elements).
cores: Integer, number of cores to use for parallelisation using
mclapply(). Parallelisation is not available on Windows. Warning:
parallelisation has increased memory requirements.
...: Optional arguments passed to FUN.
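The arithmetic behind the sliceMem limit can be checked directly. The following is a minimal sketch that assumes each block is held as a dense matrix of 8-byte doubles; max_rows_per_block() is a hypothetical helper written for illustration and is not part of the package.

```r
# Assumption: a dense numeric matrix stores each element as an 8-byte double.
bytes_per_element <- 8
long_vector_limit <- 2^31   # elements, as stated in the sliceMem description

# Memory (in GB) at which a dense block reaches the long vector limit
limit_gb <- long_vector_limit * bytes_per_element / 1e9
limit_gb                    # equals 2^34 / 1e9, just under 17.2

# Hypothetical helper: the largest number of gene rows that fit in one block
# of n_cells columns under a memory budget of sliceMem GB.
max_rows_per_block <- function(n_cells, sliceMem = 16) {
  floor(sliceMem * 1e9 / (bytes_per_element * n_cells))
}

# With 100,000 cells and the default 16 GB budget
rows_100k <- max_rows_per_block(n_cells = 1e5)
rows_100k
```

Under these assumptions, the default sliceMem = 16 sits a little below the point where a block would exceed 2^31 elements.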
Myles Lewis
sliceMem is capped because the number of elements manipulated in each
block must be kept below the long vector limit of 2^31 (around 2e9).
Increasing cores requires substantial amounts of spare RAM. combine works
in a similar way to .combine in foreach() across slices of genes; it is
only invoked if slicing occurs.
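The slicing and combining behaviour described above can be illustrated in base R. This is a minimal sketch of the general technique, not the package's implementation of slapply(): the function slice_apply_sketch(), its block_rows argument and the toy counts matrix are all hypothetical, and the block size is fixed by row count rather than derived from sliceMem.

```r
# Hypothetical sketch of row slicing; not the actual code behind slapply().
slice_apply_sketch <- function(x, FUN, combine = "c", block_rows = 30000L, ...) {
  FUN <- match.fun(FUN)
  n <- nrow(x)
  # No slicing needed: apply FUN to the whole matrix, combine is not invoked
  if (n <= block_rows) return(FUN(x, ...))
  # First row of each block of genes
  starts <- seq(1L, n, by = block_rows)
  out <- lapply(starts, function(s) {
    rows <- s:min(s + block_rows - 1L, n)
    FUN(x[rows, , drop = FALSE], ...)
  })
  # Join the per-block results: c() for vectors, rbind() for matrices
  do.call(combine, out)
}

# Toy count matrix with genes in rows and cells in columns
set.seed(1)
counts <- matrix(rpois(50 * 8, lambda = 5), nrow = 50,
                 dimnames = list(paste0("gene", 1:50), paste0("cell", 1:8)))

# One value per gene: blocks are joined with c()
m <- slice_apply_sketch(counts, rowMeans, combine = "c", block_rows = 20L)

# Several values per gene: blocks are joined with rbind()
gene_range <- function(block) t(apply(block, 1, range))
r <- slice_apply_sketch(counts, gene_range, combine = "rbind", block_rows = 20L)
dim(r)
```

In this sketch the sliced result for rowMeans matches calling rowMeans() on the whole matrix, which is the property that makes 'c' a sensible default when FUN returns one element per gene.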
scapply()