cellGeometry (version 0.5.7)

slapply: Apply a function to a big matrix by slicing

Description

Workhorse function ('slice apply') designed to handle large scRNA-Seq gene expression matrices such as embedded Seurat matrices, and apply a function to the whole matrix. Very large matrices are handled by slicing rows into blocks to avoid excess memory requirements.
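The slicing idea described above can be illustrated in base R. This is only a minimal sketch and not the cellGeometry implementation: the helper `slice_apply_sketch()` and its `block_rows` argument are hypothetical names introduced here, and the block size is fixed by row count rather than derived from a memory limit.

```r
# Hypothetical helper (not part of cellGeometry): apply FUN to consecutive
# row blocks of x, then combine the per-block results.
slice_apply_sketch <- function(x, FUN, block_rows = 1000L, combine = c, ...) {
  starts <- seq(1L, nrow(x), by = block_rows)
  pieces <- lapply(starts, function(s) {
    rows <- s:min(s + block_rows - 1L, nrow(x))
    # drop = FALSE keeps a one-row block as a matrix
    FUN(x[rows, , drop = FALSE], ...)
  })
  do.call(combine, pieces)
}

# Small simulated count matrix: genes in rows, cells in columns
set.seed(1)
counts <- matrix(rpois(50 * 20, lambda = 3), nrow = 50,
                 dimnames = list(paste0("gene", 1:50), paste0("cell", 1:20)))

sliced <- slice_apply_sketch(counts, rowMeans, block_rows = 15L)
whole <- rowMeans(counts)

# Processing in blocks should give the same per-gene result as one pass
all.equal(sliced, whole)
```

Because each block contains complete gene rows, any row-wise function gives the same answer whether it is applied to the whole matrix or block by block.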

Usage

slapply(x, FUN, combine = "c", progress = TRUE, sliceMem = 16, cores = 1L, ...)

Value

The returned data type will depend on the functions specified by FUN and combine.

Arguments

x

matrix, sparse matrix or DelayedMatrix of raw counts with genes in rows and cells in columns.

FUN

Function to be applied to each subblock of the matrix.

combine

A function, or the name of a function, used to combine results after slicing. As the function is usually applied to blocks of 30000 genes or so, the result is usually a vector with one element per gene. Hence 'c' is the default, which joins the per-block vectors into a single longer vector. However, if each gene row returns several results (e.g. a vector or data frame), then combine could be set to 'rbind'.
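The difference between combining with 'c' and with 'rbind' can be seen on a toy example. The sketch below uses base R only and splits the matrix into two blocks by hand, so it illustrates the combining step rather than slapply() itself. The object names are hypothetical.

```r
set.seed(2)
counts <- matrix(rpois(6 * 4, lambda = 5), nrow = 6,
                 dimnames = list(paste0("gene", 1:6), paste0("cell", 1:4)))

# Two row blocks, standing in for the slices
blocks <- list(counts[1:3, , drop = FALSE], counts[4:6, , drop = FALSE])

# One value per gene: c() joins the block results into one per-gene vector
per_gene_mean <- do.call(c, lapply(blocks, rowMeans))

# Several values per gene: rbind() stacks the blocks, one row per gene
per_gene_stats <- do.call(rbind, lapply(blocks, function(m) {
  cbind(mean = rowMeans(m), max = apply(m, 1, max))
}))

names(per_gene_mean)
dim(per_gene_stats)
```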

progress

Logical, whether to show progress.

sliceMem

Maximum amount of memory in GB to allow for each subsetted count matrix object. When x is subsetted by each cell subclass, if the memory required would exceed sliceMem then slicing is activated: the subsetted count matrix is divided into chunks, which are processed separately. The upper limit is just under 17.2 GB (2^34 / 1e9). At this size the subsetted matrix breaches the long vector limit (>2^31 elements).
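The 17.2 GB figure can be reproduced with a short calculation. The sketch below assumes 8 bytes per element (double-precision storage), which is what makes 2^31 elements equal 2^34 bytes. The matrix dimensions are hypothetical, and the block count is a rough estimate, not necessarily the number of slices slapply() would choose.

```r
bytes_per_element <- 8        # assumption: double-precision storage
max_elements <- 2^31          # long vector threshold quoted above

limit_gb <- max_elements * bytes_per_element / 1e9
limit_gb                      # about 17.18

# Hypothetical dense block: 30000 genes by 80000 cells
n_genes <- 30000
n_cells <- 80000
n_elements <- n_genes * n_cells
n_elements > max_elements     # TRUE: too large to handle in one piece

dense_gb <- n_elements * bytes_per_element / 1e9

# Rough estimate of blocks needed if each must fit within sliceMem = 16
est_blocks <- ceiling(dense_gb / 16)
```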

cores

Integer, number of cores to use for parallelisation using mclapply(). Parallelisation is not available on Windows. Warning: parallelisation increases memory requirements.
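As a rough illustration of block-wise parallelisation, the sketch below calls parallel::mclapply() directly on hand-made row blocks. It is not how slapply() is implemented internally. The fallback to a single core on Windows, the choice of two cores elsewhere, and the block layout are all assumptions made for this example.

```r
library(parallel)

set.seed(3)
counts <- matrix(rpois(40 * 10, lambda = 2), nrow = 40)

# mclapply() relies on forking, so fall back to one core on Windows
cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Split the row indices into 4 consecutive blocks of 10 rows
row_blocks <- split(seq_len(nrow(counts)), rep(1:4, each = 10))

# Each worker receives its own block, hence the higher memory use
block_sums <- mclapply(row_blocks, function(rows) {
  rowSums(counts[rows, , drop = FALSE])
}, mc.cores = cores)

# mclapply() returns results in input order, so blocks can be joined directly
result <- unlist(block_sums, use.names = FALSE)
all.equal(result, rowSums(counts))
```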

...

Optional arguments passed to FUN.

Author

Myles Lewis

Details

sliceMem is capped because the number of elements manipulated in each block must be kept below the long vector limit of 2^31 (around 2e9). Increasing cores requires substantial amounts of spare RAM. combine works in a similar way to .combine in foreach(), operating across slices of genes; it is only invoked if slicing occurs.

See Also

scapply()