This dataset was generated in the "Generate Simulation Datasets" step of Parametric_simulation.rmd (https://github.com/Lujun995/DiSC_Replication_Code).
data("sim_data")
It contains 12 cases and 12 controls, each with 375 cell replicates. Read depths are well balanced across cell replicates. An individual-level covariate, RIN (RNA Integrity Number), is included in the dataset.
The dataset comprises a total of 1,000 genes. The signal density is 15%: 5% of genes differ in mean, 5% in variance, and 5% in both mean and variance. The ground truth for differentially and equally expressed genes is given by gene_index, which comprises mean_index (genes with a difference in mean), var_index (genes with a difference in variance), mean_var_index (genes with a difference in both mean and variance), and EE_index (all remaining genes, which are equally expressed and can be used to estimate the type I error).
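The ground-truth sets described above can be tabulated as a quick sanity check. The sketch below is illustrative only: summarise_gene_index is a hypothetical helper, not part of any package, and it assumes gene_index is a named list of numeric ID vectors. The commented usage further assumes the dataset ships with an installed package named DiSC, which this page does not state.

```r
# Hypothetical helper: count the genes in each ground-truth set and express
# each count as a fraction of the total number of genes.
summarise_gene_index <- function(gene_index, n_genes) {
  counts <- lengths(gene_index)  # number of gene IDs in each set
  data.frame(
    set = names(gene_index),
    n_genes = unname(counts),
    fraction = unname(counts) / n_genes,
    stringsAsFactors = FALSE
  )
}

# Usage (assumption: the data are available from a package named DiSC):
# data("sim_data", package = "DiSC")
# summarise_gene_index(sim_data$gene_index, n_genes = 1000)
```

If the dataset matches the description, the three differential sets should each account for about 5% of the genes and EE_index for the remainder.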
A list with the following elements:
count_matrix
A numeric count matrix.
meta_cell
A data.frame of the metadata at the cell level.
meta_ind
A data.frame of the metadata at the individual level.
gene_index
A list of 4 numeric vectors giving the ground-truth IDs of the differentially and equally expressed genes.
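The format listed above can be verified after loading the data. The following is a minimal sketch, not an official validator: check_sim_data is a hypothetical name, and the check on count_matrix assumes a base R numeric matrix (a sparse matrix class would need a different test).

```r
# Hypothetical helper: stop with an error unless `x` has the four documented
# elements with the documented types.
check_sim_data <- function(x) {
  expected <- c("count_matrix", "meta_cell", "meta_ind", "gene_index")
  stopifnot(
    is.list(x),
    all(expected %in% names(x)),
    is.numeric(x$count_matrix),        # assumes a base R numeric matrix
    is.data.frame(x$meta_cell),
    is.data.frame(x$meta_ind),
    is.list(x$gene_index),
    length(x$gene_index) == 4,
    all(vapply(x$gene_index, is.numeric, logical(1)))
  )
  invisible(TRUE)
}

# Usage, once the dataset has been loaded:
# data("sim_data")
# check_sim_data(sim_data)
```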