simulate_bulk: Simulate pseudo-bulk RNA-Seq

Description

Simulates a pseudo-bulk RNA-Seq dataset using one of two modes. The first mode uses a 'cellMarkers' class object and a matrix giving the number of cells of each cell subclass. This mode converts the log2 gene means for each cell subclass back to count scale and then calculates pseudo-bulk count values based on the cell amounts specified in samples. The second mode requires a single-cell RNA-Seq dataset, such as a matrix used as input to cellMarkers(). Cells from the relevant subclass are sampled from the single-cell matrix in the amounts specified in samples, with the amounts scaled up by the factor times.
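
The arithmetic behind the first mode can be sketched in base R without the package. This is only an illustration on toy data, not the package's implementation: the object names are hypothetical, and it assumes the log2 gene means are stored as log2(mean count + 1), that the result is rounded before conversion to integer, and that the output has one column per simulated sample.

```r
# Hypothetical toy data: log2(mean count + 1) for 3 genes x 2 subclasses
log2_means <- matrix(c(3, 0, 5,    # Tcell column
                       1, 4, 2),   # Bcell column
                     nrow = 3,
                     dimnames = list(paste0("gene", 1:3), c("Tcell", "Bcell")))

# Cell counts: 2 samples (rows) x 2 subclasses (columns)
samples <- matrix(c(10L, 30L,
                    20L,  5L),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("s1", "s2"), c("Tcell", "Bcell")))

# Back-transform to count scale (assumes a pseudocount of 1 was used)
count_means <- 2^log2_means - 1

# Each pseudo-bulk sample is the sum over subclasses of
# (mean count per cell) x (number of cells)
pseudo_bulk <- round(count_means %*% t(samples))
storage.mode(pseudo_bulk) <- "integer"

pseudo_bulk["gene1", "s1"]  # 7 * 10 + 1 * 30 = 100
```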

Usage

simulate_bulk(
  object,
  samples,
  subclass,
  times = 1,
  method = c("dirichlet", "unif"),
  alpha = 1
)

Value

An integer count matrix with genes in rows and simulated samples in columns. This can be passed as the test argument of the deconvolute() function.

Arguments

object: Either a 'cellMarkers' class object, or a single cell count matrix with genes in rows and cells in columns, with rownames representing gene IDs/symbols. The matrix can be a sparse matrix or DelayedMatrix.
samples: An integer matrix of cell counts with samples in rows and one column for each cell subclass in object. This can be generated using generate_samples().
subclass: Vector of cell subclasses matching the columns in object. Only used if object is a single cell count matrix.
times: Scaling factor to increase sampling of cells. Cell counts in samples are multiplied by this number. Only used if object is a single cell count matrix.
method: Either "dirichlet" or "unif", specifying whether cells are sampled based on the Dirichlet distribution with K = number of cells in each subclass, or sampled uniformly. When cells are oversampled uniformly, the summed gene expression tends in the limit to the arithmetic mean of the subclass multiplied by the sample frequency. Dirichlet sampling provides proper randomness in the sampling.
alpha: Shape parameter for Dirichlet sampling.
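
The roles of subclass, times, method and alpha in the single-cell mode can be illustrated with a short base R sketch. It is a hedged approximation of the idea, not the package's code: the helpers rdirichlet1() and sample_subclass() are hypothetical, and it assumes that Dirichlet sampling means drawing one weight per cell in a subclass and then sampling cells with replacement using those weights.

```r
set.seed(1)

# Hypothetical single-cell counts: 4 genes x 6 cells
sc <- matrix(rpois(24, lambda = 5), nrow = 4,
             dimnames = list(paste0("gene", 1:4), paste0("cell", 1:6)))
subclass <- c("A", "A", "A", "B", "B", "B")  # one label per column of sc

n_cells <- c(A = 4L, B = 2L)  # one row of `samples`
times <- 3                    # oversampling factor

# One Dirichlet draw of length k via normalised gamma variates
rdirichlet1 <- function(k, alpha = 1) {
  g <- rgamma(k, shape = alpha)
  g / sum(g)
}

# Sum the counts of n cells drawn with replacement from one subclass
sample_subclass <- function(cl, n, method = c("dirichlet", "unif"), alpha = 1) {
  method <- match.arg(method)
  idx <- which(subclass == cl)
  # Dirichlet: unequal per-cell probabilities; unif: equal probabilities
  prob <- if (method == "dirichlet") rdirichlet1(length(idx), alpha) else NULL
  picked <- sample.int(length(idx), size = n, replace = TRUE, prob = prob)
  w <- tabulate(picked, nbins = length(idx))  # times each cell was drawn
  drop(sc[, idx, drop = FALSE] %*% w)
}

# Pseudo-bulk counts for one sample: add up the subclass contributions
bulk <- Reduce(`+`, lapply(names(n_cells), function(cl) {
  sample_subclass(cl, n_cells[[cl]] * times, method = "dirichlet")
}))
```

Under this reading, a small alpha concentrates the weights on a few cells, so repeated samples differ more from one another, whereas uniform sampling with a large times averages towards the subclass mean.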

Details

The first mode (using a 'cellMarkers' object) can give perfect deconvolution if the following settings are used with deconvolute(): count_space = TRUE, convert_bulk = FALSE, use_filter = FALSE and comp_amount = 1.
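
One way to see why exact recovery is possible: in the first mode, each pseudo-bulk sample is an exact linear combination of the count-scale subclass means, so a linear solve in count space returns the original cell counts. The sketch below shows this with ordinary least squares on hypothetical toy data. It does not call deconvolute() and is not a description of that function's algorithm.

```r
# Hypothetical count-scale gene means: 4 genes x 2 subclasses
sig <- matrix(c(7,  0, 31, 2,    # Tcell column
                1, 15,  3, 9),   # Bcell column
              nrow = 4,
              dimnames = list(paste0("gene", 1:4), c("Tcell", "Bcell")))

# True cell counts: 2 samples (rows) x 2 subclasses (columns)
samples <- matrix(c(10, 30,
                    20,  5),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("s1", "s2"), c("Tcell", "Bcell")))

# Pseudo-bulk built as in the first mode: genes x samples
bulk <- sig %*% t(samples)

# Least-squares solution of sig %*% x = bulk (subclasses x samples)
est <- qr.solve(sig, bulk)

# Estimated cell counts match the inputs up to numerical tolerance
recovered <- isTRUE(all.equal(t(est), samples, check.attributes = FALSE))
```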
