iCheck (version 1.2.0)

genSimData.BayesNormal: Generating simulated data set from conditional normal distributions

Description

Generates a simulated data set from conditional normal distributions.

Usage

genSimData.BayesNormal(nCpGs, nCases, nControls,
  mu.n = -2, mu.c = 2,
  d0 = 20, s02 = 0.64, s02.c = 1.5,
  testPara = "var", outlierFlag = FALSE,
  eps = 0.001, applier = lapply)

Arguments

nCpGs
integer. Number of CpG sites (genes).
nCases
integer. Number of cases.
nControls
integer. Number of controls.
mu.n
numeric. mean of the conditional normal distribution for controls. See details.
mu.c
numeric. mean of the conditional normal distribution for cases. See details.
d0
integer. Degrees of freedom for the scaled inverse chi-squared distribution. See Details.
s02
numeric. Scaling parameter of the scaled inverse chi-squared distribution for controls. See Details.
s02.c
numeric. Scaling parameter of the scaled inverse chi-squared distribution for cases. See Details.
testPara
character string. Indicates whether the test is for equal means, equal variances, or both.
outlierFlag
logical. Indicates whether outliers are generated. If outlierFlag=TRUE, one outlier is generated for each CpG site, following the simulation studies of Phipson and Oshlack (2014): the DNA methylation level of one diseased subject is replaced by the maximum of the DNA methylation levels of all CpG sites.
eps
numeric. if $|mean0-mean1|
applier
function. The function used to perform the apply operation (for example, lapply).
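
The outlier replacement described for outlierFlag can be illustrated with a minimal base R sketch. The helper add_outliers is hypothetical and is not part of iCheck. It assumes that rows are CpG sites, that columns are subjects, and that "the maximum of the DNA methylation levels of all CpG sites" means the maximum over the whole data matrix; the package may implement this step differently.

```r
# Hypothetical helper, not part of iCheck: for each CpG site (row),
# replace the value of one randomly chosen case by the overall maximum.
add_outliers <- function(m, case_cols) {
  # Assumption: the maximum is taken over the whole matrix.
  max_m <- max(m)
  for (i in seq_len(nrow(m))) {
    # Pick one case column at random for this CpG site.
    j <- case_cols[sample.int(length(case_cols), 1)]
    m[i, j] <- max_m
  }
  m
}

set.seed(2)
m <- matrix(rnorm(5 * 6), nrow = 5)        # 5 CpG sites, 6 subjects
m_out <- add_outliers(m, case_cols = 4:6)  # columns 4 to 6 are the cases
```

After the call, every row of m_out contains the overall maximum in exactly one of its case columns (barring ties), and the control columns are unchanged.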

Value

An ExpressionSet object. The phenotype data of the ExpressionSet object contains 2 columns: arrayID (array id) and memSubj (subject membership, i.e., case (memSubj=1) or control (memSubj=0)). The feature data of the ExpressionSet object contains 4 elements: probe (probe id), gene (pseudo gene symbol), chr (pseudo chromosome number), and memGenes (indicating whether a gene is differentially expressed (when testPara="mean") or differentially variable (when testPara="var")).

Details

Based on Phipson and Oshlack's (2014) simulation algorithm. For controls, the variance of the DNA methylation at each CpG site was first sampled from a scaled inverse chi-squared distribution with degrees of freedom $d_0$ and scaling parameter $s_0^2$: $\sigma^2_i \sim \text{scaled-inv-}\chi^2(d_0, s_0^2)$. The M value for each CpG site was then sampled from a normal distribution with mean $\mu_n$ and variance equal to the simulated variance $\sigma^2_i$. For cases, the variance was first generated from $\sigma^2_{i,c} \sim \text{scaled-inv-}\chi^2(d_0, s_{0,c}^2)$. The M value for each CpG site was then sampled from a normal distribution with mean $\mu_c$ and variance equal to the simulated variance $\sigma^2_{i,c}$.
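
The two sampling steps above can be sketched in base R. This is an illustration only, not the package's implementation: the helper sim_group and the variable names are hypothetical, and the sketch ignores testPara, eps, outlierFlag, and the construction of the ExpressionSet. It relies on the standard fact that if $X \sim \chi^2(d_0)$, then $d_0 s_0^2 / X$ follows a scaled inverse chi-squared distribution with parameters $(d_0, s_0^2)$.

```r
# Hypothetical helper, not part of iCheck: simulate M values for one group.
sim_group <- function(n_cpgs, n_subj, mu, d0, s02) {
  # Step 1: one variance per CpG site from scaled-inv-chi^2(d0, s02).
  sigma2 <- d0 * s02 / rchisq(n_cpgs, df = d0)
  # Step 2: M values from N(mu, sigma2[i]); rows are CpG sites, columns
  # are subjects. The sd vector is recycled down each column, so row i
  # always uses sigma2[i].
  m <- matrix(rnorm(n_cpgs * n_subj, mean = mu, sd = sqrt(sigma2)),
              nrow = n_cpgs, ncol = n_subj)
  list(sigma2 = sigma2, m = m)
}

set.seed(1)
controls <- sim_group(100, 20, mu = -2, d0 = 20, s02 = 0.64)
cases    <- sim_group(100, 20, mu =  2, d0 = 20, s02 = 1.5)

# Combine into one matrix; membership is 0 for controls and 1 for cases.
m_values <- cbind(controls$m, cases$m)
mem_subj <- rep(c(0, 1), times = c(20, 20))
```

The parameter values mirror the defaults of genSimData.BayesNormal, with the larger scaling parameter for cases producing more variable M values in that group.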

References

Phipson B, Oshlack A. DiffVar: A new method for detecting differential variability with application to methylation in cancer and aging. Genome Biol 2014; 15:465

Examples

    # generate simulated data set from conditional normal distribution
    set.seed(1234567)
    es.sim = genSimData.BayesNormal(nCpGs = 100, 
      nCases = 20, nControls = 20,
      mu.n = -2, mu.c = 2,
      d0 = 20, s02 = 0.64, s02.c = 1.5, testPara = "var",
      outlierFlag = FALSE, 
      eps = 1.0e-3, applier = lapply) 
    print(es.sim)