Simulates a gene/SNP pair with baseline covariates XX,
cell type compositions true_RHO, phased SNP genotypes true_SNP,
and total read counts (TReC) and allele-specific read counts (ASReC) contained in dat.
CSeQTL_dataGen(
NN,
MAF,
true_BETA0 = log(1000),
true_KAPPA,
true_ETA,
true_PHI = 0.1,
true_PSI = 0.05,
prob_phased = 0.05,
true_ALPHA = NULL,
batch = 1,
RHO = NULL,
cnfSNP = FALSE,
show = TRUE
)
An R list containing the true parameters governing the simulated dataset,
the simulated covariate matrix XX, and the observed outcomes in dat.
NN: Positive integer for sample size.
MAF: Positive numeric value between 0 and 1 for the minor allele frequency used to simulate phased SNP genotypes assuming Hardy-Weinberg equilibrium.
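As a minimal sketch of what simulating phased genotypes under Hardy-Weinberg equilibrium can look like, each sample's two haplotypes can be drawn independently, each carrying the minor allele with probability MAF. The names hap1, hap2, geno and phased, the allele labels "A"/"B", and the seed are illustrative assumptions, not the package's internal code.

```r
set.seed(1)
NN  <- 1000   # sample size
MAF <- 0.3    # minor allele frequency

# Draw the two haplotypes independently (1 = minor allele, 0 = major allele)
hap1 <- rbinom(NN, size = 1, prob = MAF)
hap2 <- rbinom(NN, size = 1, prob = MAF)

# Unphased genotype = number of minor alleles (0, 1 or 2)
geno <- hap1 + hap2

# Phased genotype keeps track of which haplotype carries which allele
phased <- paste0(ifelse(hap1 == 1, "B", "A"), ifelse(hap2 == 1, "B", "A"))
table(phased)

# Genotype frequencies expected under Hardy-Weinberg equilibrium
expected <- c(AA = (1 - MAF)^2, AB = 2 * MAF * (1 - MAF), BB = MAF^2)
expected
```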
true_BETA0: A positive numeric value denoting the log of twice the expression
of the reference base in the reference cell type.
For example, if the TReC for the reference base and cell type is 500, then
true_BETA0 = log(2 * 500).
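The arithmetic in this example can be checked directly in R. Here ref_trec is a hypothetical name for the TReC of the reference base in the reference cell type.

```r
ref_trec   <- 500                  # hypothetical reference TReC
true_BETA0 <- log(2 * ref_trec)    # doubled, then log transformed

# This matches the default value in the usage, log(1000)
isTRUE(all.equal(true_BETA0, log(1000)))

# Inverting the transformation recovers the reference TReC
exp(true_BETA0) / 2
```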
true_KAPPA: A numeric vector denoting the baseline fold change in TReC between each cell type and the reference cell type. By definition, the first element is 1.
true_ETA: A numeric vector where each element denotes the fold change in TReC between the non-reference and reference base in a cell type.
true_PHI: A non-negative numeric value denoting the over-dispersion term
associated with TReC. If true_PHI > 0, TReC is simulated with the
negative binomial distribution. If true_PHI = 0, TReC is simulated with the Poisson distribution.
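The switch between the two count distributions can be sketched in base R. This assumes the over-dispersion enters as variance = mu + phi * mu^2, which corresponds to size = 1 / phi in rnbinom(); the package may parameterize it differently. The helper sim_trec is hypothetical.

```r
# Hypothetical helper: draw n total read counts with mean mu.
# Assumes Var = mu + phi * mu^2 when phi > 0 (an assumption, see above).
sim_trec <- function(n, mu, phi) {
  stopifnot(phi >= 0)
  if (phi > 0) {
    rnbinom(n, size = 1 / phi, mu = mu)   # negative binomial
  } else {
    rpois(n, lambda = mu)                 # Poisson limit
  }
}

set.seed(2)
mu     <- 1000
y_nb   <- sim_trec(5000, mu, phi = 0.1)
y_pois <- sim_trec(5000, mu, phi = 0)

# The negative binomial draws are far more variable than the Poisson draws
c(var_nb = var(y_nb), var_pois = var(y_pois))
```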
true_PSI: A non-negative numeric value denoting the over-dispersion term
associated with ASReC. If true_PSI > 0, ASReC is simulated with the
beta-binomial distribution; otherwise it is simulated with the binomial distribution.
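A comparable sketch for the allele-specific counts draws a per-sample probability from a beta distribution and then a binomial count. The parameterization Beta(p_mean / psi, (1 - p_mean) / psi) is an assumption chosen for illustration, and sim_asrec, total and p_mean are hypothetical names; the package's internal form may differ.

```r
# Hypothetical helper: draw allele-specific counts out of `total` reads.
# p_mean is the expected fraction of reads from one allele.
sim_asrec <- function(total, p_mean, psi) {
  stopifnot(psi >= 0, p_mean > 0, p_mean < 1)
  n <- length(total)
  if (psi > 0) {
    # Beta-binomial: the success probability itself varies across samples
    p <- rbeta(n, shape1 = p_mean / psi, shape2 = (1 - p_mean) / psi)
  } else {
    # Binomial: a fixed success probability
    p <- rep(p_mean, n)
  }
  rbinom(n, size = total, prob = p)
}

set.seed(3)
total <- rep(50, 5000)
a_bb  <- sim_asrec(total, p_mean = 0.5, psi = 0.05)
a_bin <- sim_asrec(total, p_mean = 0.5, psi = 0)

# Over-dispersion inflates the variance relative to the binomial
c(var_bb = var(a_bb), var_bin = var(a_bin))
```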
prob_phased: A positive numeric value denoting the proportion of simulated TReC that are ASReC.
true_ALPHA: By default, it is set to NULL, which makes every cell
type with an eQTL a cis-eQTL. Otherwise, a positive numeric vector
of fold changes between TReC eQTL effect sizes and ASReC eQTL effect sizes.
batch: A numeric value set to 1 by default to allow underlying batch effects. Set to zero to eliminate batch effects.
RHO: A numeric matrix of cell type proportions where each row sums to one.
If set to NULL, a matrix of cell type proportions will be simulated.
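For readers who want to supply their own matrix, one simple way to build cell type proportions whose rows sum to one is to normalize positive random draws. This is only a sketch: the gamma draws, the number of cell types QQ and the column names are assumptions, and the function's own simulation of proportions may work differently.

```r
set.seed(4)
NN <- 5   # samples (rows)
QQ <- 3   # cell types (columns), an illustrative choice

# Positive random draws, one per sample and cell type
raw <- matrix(rgamma(NN * QQ, shape = 2), nrow = NN, ncol = QQ)

# Divide each row by its total so that every row sums to one
RHO <- raw / rowSums(raw)
colnames(RHO) <- paste0("CT", seq_len(QQ))

rowSums(RHO)
```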
cnfSNP: A boolean value where TRUE re-arranges simulated SNPs to
correlate with baseline bulk expression. When the marginal model
(which does not account for cell type proportions) is fit in the presence of cell
type-specific differential expression, a marginal eQTL may be incorrectly inferred.
show: A boolean value to display verbose output and plot intermediate simulated results.
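A call might look like the following sketch. It assumes the CSeQTL package is installed and that true_KAPPA and true_ETA each hold one element per cell type (three here); the specific values are illustrative only. The call is guarded so the snippet still runs when the package is unavailable.

```r
# Illustrative arguments for three cell types (values are assumptions)
args <- list(
  NN         = 200,
  MAF        = 0.3,
  true_BETA0 = log(1000),
  true_KAPPA = c(1, 2, 0.5),   # first element is 1 by definition
  true_ETA   = c(1, 1.5, 1),   # eQTL fold change in the second cell type only
  true_PHI   = 0.1,
  true_PSI   = 0.05,
  show       = FALSE           # suppress verbose output and plots
)

if (requireNamespace("CSeQTL", quietly = TRUE)) {
  sim <- do.call(CSeQTL::CSeQTL_dataGen, args)
  print(names(sim))            # inspect the components of the returned list
}
```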