cellGeometry (version 0.5.7)

simulate_bulk: Simulate pseudo-bulk RNA-Seq

Description

Simulates a pseudo-bulk RNA-Seq dataset using one of two modes. The first mode uses a 'cellMarkers' class object and a matrix of counts for the numbers of cells of each cell subclass. This mode converts the log2 gene means for each cell subclass back to count scale and then calculates pseudo-bulk count values based on the cell amounts specified in samples. In the second mode, a single-cell RNA-Seq dataset is required, such as a matrix used as input to cellMarkers(). Cells from the relevant subclass are sampled from the single-cell matrix in the appropriate amounts based on samples, except that sampling is scaled up by the factor times.
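
The arithmetic behind the first mode can be sketched in base R. This is only an illustration and not the package's internal code: the toy objects genemeans_log2 and samples are hypothetical, and the sketch assumes the log2 gene means were computed as log2(count + 1), which the text above does not state.

```r
# Hypothetical log2 gene means: 3 genes x 2 cell subclasses
genemeans_log2 <- matrix(c(3, 0, 1, 5, 2, 2), nrow = 3,
                         dimnames = list(paste0("gene", 1:3), c("A", "B")))

# Hypothetical cell counts: 2 samples (rows) x 2 cell subclasses (columns)
samples <- matrix(c(10L, 0L, 5L, 20L), nrow = 2,
                  dimnames = list(c("s1", "s2"), c("A", "B")))

# Back-transform to count scale (assumes means were stored as log2(count + 1))
mean_counts <- 2^genemeans_log2 - 1

# Each pseudo-bulk value is the sum over subclasses of
# (mean count per cell) x (number of cells of that subclass in the sample)
pseudo_bulk <- round(mean_counts %*% t(samples))
storage.mode(pseudo_bulk) <- "integer"

pseudo_bulk
```

The result has genes in rows and one column per row of samples.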

Usage

simulate_bulk(
  object,
  samples,
  subclass,
  times = 1,
  method = c("dirichlet", "unif"),
  alpha = 1
)

Value

An integer count matrix with genes in rows and samples in columns. This can be passed as the test argument to the deconvolute() function.

Arguments

object

Either a 'cellMarkers' class object, or a single cell count matrix with genes in rows and cells in columns, with rownames representing gene IDs/symbols. The matrix can be a sparse matrix or DelayedMatrix.

samples

An integer matrix of cell counts with samples in rows and one column for each cell subclass in object. This can be generated using generate_samples().

subclass

Vector of cell subclasses matching the columns in object. Only used if object is a single cell count matrix.

times

Scaling factor to increase sampling of cells. Cell counts in samples are scaled up by being multiplied by this number. Only used if object is a single cell count matrix.

method

Either "dirichlet" or "unif" to specify whether cells are sampled based on the Dirichlet distribution with K = number of cells in each subclass, or sampled uniformly. When cells are oversampled uniformly, in the limit the summed gene expression tends to the arithmetic mean of the subclass multiplied by the sample frequency. Dirichlet sampling avoids this and retains proper randomness in the sampling.

alpha

Shape parameter for Dirichlet sampling.
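
The difference between the two sampling methods in the second mode can be sketched in base R. This is a sketch under stated assumptions, not the package's implementation: the helpers sample_cells() and bulk_one_sample() and all toy data are hypothetical, cells are assumed to be drawn with replacement, and the Dirichlet weights are assumed to act as per-cell sampling probabilities (drawn here as normalised gamma variates with shape alpha).

```r
set.seed(42)

# Hypothetical single-cell count matrix: 4 genes x 30 cells
sc <- matrix(rpois(4 * 30, lambda = 5), nrow = 4,
             dimnames = list(paste0("gene", 1:4), paste0("cell", 1:30)))
subclass <- rep(c("A", "B"), each = 15)  # subclass label for each column of sc
n_cells <- c(A = 10L, B = 5L)            # one row of a `samples` matrix
times <- 3                               # scaling factor for the cell counts

# Hypothetical helper: pick n cells from the candidate column indices idx
sample_cells <- function(idx, n, method = c("dirichlet", "unif"), alpha = 1) {
  method <- match.arg(method)
  prob <- NULL  # NULL gives equal probabilities (uniform sampling)
  if (method == "dirichlet") {
    # A symmetric Dirichlet(alpha) draw with K = length(idx) components,
    # obtained by normalising independent gamma variates
    g <- rgamma(length(idx), shape = alpha)
    prob <- g / sum(g)
  }
  idx[sample.int(length(idx), size = n, replace = TRUE, prob = prob)]
}

# Hypothetical helper: sum counts over the sampled cells for one sample
bulk_one_sample <- function(method) {
  picked <- unlist(lapply(names(n_cells), function(s) {
    sample_cells(which(subclass == s), n_cells[[s]] * times, method)
  }))
  rowSums(sc[, picked, drop = FALSE])
}

bulk_dir <- bulk_one_sample("dirichlet")
bulk_unif <- bulk_one_sample("unif")
```

With uniform sampling every cell in a subclass is equally likely to be picked, so heavy oversampling averages out cell-to-cell differences. With Dirichlet weights some cells dominate each draw, so repeated simulations differ more from one another.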

Details

The first mode can give perfect deconvolution if the following settings are used with deconvolute(): count_space = TRUE, convert_bulk = FALSE, use_filter = FALSE and comp_amount = 1.
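
One way to see why exact recovery is possible in principle: if the pseudo-bulk matrix is an exact linear combination of the per-subclass mean counts, solving the linear system in count space returns the original cell counts. The base R sketch below illustrates this with hypothetical toy data and ordinary least squares via qr.solve(); it is not the algorithm used by deconvolute().

```r
# Hypothetical mean counts per cell: 3 genes x 2 cell subclasses
mean_counts <- matrix(c(7, 0, 1, 31, 3, 3), nrow = 3,
                      dimnames = list(paste0("gene", 1:3), c("A", "B")))

# Hypothetical cell counts: 2 samples (rows) x 2 cell subclasses (columns)
samples <- matrix(c(10L, 0L, 5L, 20L), nrow = 2,
                  dimnames = list(c("s1", "s2"), c("A", "B")))

# Noise-free pseudo-bulk in count space: genes x samples
bulk <- mean_counts %*% t(samples)

# Least-squares solution of mean_counts %*% est = bulk (subclasses x samples)
est <- qr.solve(mean_counts, bulk)

# Transpose back to samples x subclasses for comparison with `samples`
recovered <- t(round(est))
recovered
```

Recovery is exact here only because no noise, rounding or log transformation sits between the signature and the bulk matrix, and because the columns of mean_counts are linearly independent.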

See Also

generate_samples(), deconvolute(), add_noise()