ssizeRNA (version 1.1.2)

sim.counts: RNA-seq Count Data Simulation from Negative-Binomial Distribution

Description

This function simulates RNA-seq count data from a negative binomial distribution, given the mean counts, dispersion, and log fold change. It returns a count data matrix together with the parameters used to generate it.
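As a rough illustration of the kind of draw described above, the following base-R sketch simulates negative binomial counts for a handful of genes. It is not the package's implementation. It assumes the parameterization common in RNA-seq work, where the variance is mu + disp * mu^2 (so `size = 1 / disp` in `rnbinom()`), and it assumes the log fold change is a natural log, as `log(2)` in the Examples suggests. The variable names are hypothetical.

```r
set.seed(1)

n_genes     <- 5       # hypothetical small gene count, for illustration only
n_per_group <- 3       # samples per treatment group
mu          <- 10      # mean count in the control group
disp        <- 0.1     # dispersion (assumed: variance = mu + disp * mu^2)
logfc       <- log(2)  # natural-log fold change (assumed)

# Control-group mean per gene, and treatment-group mean after applying the
# fold change. For simplicity every gene is shifted here.
mu_control <- rep(mu, n_genes)
mu_treat   <- mu_control * exp(logfc)

# One row per gene, one column per sample: control samples, then treatment.
mean_matrix <- cbind(
  matrix(mu_control, nrow = n_genes, ncol = n_per_group),
  matrix(mu_treat,   nrow = n_genes, ncol = n_per_group)
)

# rnbinom() is vectorised over mu; the draws are filled back column by column,
# matching the column-major layout of mean_matrix.
counts <- matrix(
  rnbinom(length(mean_matrix), size = 1 / disp, mu = mean_matrix),
  nrow = n_genes
)

dim(counts)
```

With a dispersion of 0.1 and a mean of 10, this parameterization gives a variance of 10 + 0.1 * 10^2 = 20 in the control group, twice what a Poisson model would give.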

Usage

sim.counts(arg, mu, disp, logfc, up = 0.5, replace = TRUE)

Arguments

arg
a list of global parameters to pass into the function, such as total number of genes, proportion of non-differentially expressed genes and treatment groups. See Details for more information.
mu
a vector (or scalar) of mean counts in control group from which to simulate.
disp
a vector (or scalar) of dispersion parameters from which to simulate.
logfc
a vector (or scalar) of log fold change between treatment group and control group.
up
proportion of up-regulated genes among all differentially expressed genes. The default value is 0.5.
replace
whether to sample from the given parameters with replacement (TRUE, the default) or without replacement (FALSE). See Details for more information.

Value

  • counts: RNA-seq count data matrix.
  • group: treatment group vector.
  • lambda0: mean counts in control group for each gene.
  • phi0: dispersion parameter for each gene.
  • de: differentially expressed genes indicator: 0 for non-differentially expressed genes, 1 for up-regulated genes, -1 for down-regulated genes.
  • delta: log fold change for each gene between treatment group and control group.
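To make the coding of de and delta concrete, here is a minimal base-R sketch of how such an indicator could be built from pi0 and up. It is an illustration under stated assumptions, not the package's code: it assumes exact gene counts per class rather than random class assignment, and it assumes down-regulated genes receive the negated log fold change. All variable names other than the documented ones are hypothetical.

```r
set.seed(1)

nG    <- 10000   # total number of genes
pi0   <- 0.8     # proportion of non-differentially expressed genes
up    <- 0.5     # proportion of up-regulated genes among DE genes
logfc <- log(2)  # log fold change

n_de   <- round(nG * (1 - pi0))  # number of differentially expressed genes
n_up   <- round(n_de * up)       # up-regulated genes
n_down <- n_de - n_up            # down-regulated genes

# Indicator: 0 = not DE, 1 = up-regulated, -1 = down-regulated,
# shuffled so that DE genes land at random positions.
de <- sample(c(rep(0, nG - n_de), rep(1, n_up), rep(-1, n_down)))

# Per-gene log fold change: zero for non-DE genes, signed for DE genes
# (the sign convention is an assumption of this sketch).
delta <- de * logfc

table(de)
```

With the values above, 8000 genes are coded 0, 1000 are coded 1, and 1000 are coded -1.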

Details

arg = list(nG, pi0, group), where nG is the total number of genes, pi0 is the proportion of non-differentially expressed genes, and group is the vector of treatment group labels, one per sample.

If the total number of genes is larger than the length of mu or disp, replace is always set to TRUE, regardless of the value supplied.
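The rule above follows from how sampling works: nG values cannot be drawn without replacement from a shorter vector. The base-R sketch below shows one way such a fallback could look. The helper `pick_params()` is hypothetical and is not part of ssizeRNA; how the package actually pairs mu and disp when sampling is not described on this page.

```r
# Hypothetical helper: draw nG per-gene parameter values from a supplied pool.
pick_params <- function(values, nG, replace = TRUE) {
  # With fewer supplied values than genes, sampling without replacement is
  # impossible, so force replace = TRUE.
  if (nG > length(values)) {
    replace <- TRUE
  }
  # sample.int() avoids the surprise of sample(x) when x has length one.
  idx <- sample.int(length(values), nG, replace = replace)
  values[idx]
}

set.seed(1)
mu_pool <- c(5, 10, 50, 200)

# Ten genes from a pool of four: replace = FALSE is overridden.
lambda0 <- pick_params(mu_pool, nG = 10, replace = FALSE)
length(lambda0)

# Three genes from a pool of four: replace = FALSE is honoured.
subset_mu <- pick_params(mu_pool, nG = 3, replace = FALSE)
anyDuplicated(subset_mu)
```

A scalar mu or disp, as in the Examples, is the extreme case: every gene receives the same value.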

Examples

arg <- list(nG = 10000,                      ## total number of genes
            pi0 = 0.8,                       ## proportion of non-differentially expressed genes
            group = rep(c(1, 2), each = 3))  ## treatment groups
mu <- 10                                     ## mean counts in control group for all genes
disp <- 0.1                                  ## dispersion for all genes
logfc <- log(2)                              ## log fold change for up-regulated genes

RNA_simu <- sim.counts(arg, mu, disp, logfc, up = 0.5, replace = TRUE)
RNA_simu$counts                              ## count data matrix