The function simulates DNA methylation data from multiple samples. See the references for a detailed explanation of the statistics.
dataSim(replicates, sites, treatment, percentage = 10, effect = 25,
alpha = 0.4, beta = 0.5, theta = 10, covariates = NULL,
sample.ids = NULL, assembly = "hg18", context = "CpG",
add.info = FALSE)
replicates
the number of samples that should be simulated.
sites
the number of CpG sites per sample.
treatment
a vector containing treatment information.
percentage
the proportion of sites which should be affected by the treatment.
effect
a number or vector specifying the effect size of the treatment. See `Examples'.
alpha
shape1 parameter for beta distribution (used for substitution probabilities)
beta
shape2 parameter for beta distribution (used for substitution probabilities)
theta
dispersion parameter for beta distribution (used for substitution probabilities)
covariates
a data.frame containing covariates (optional)
sample.ids
will be generated automatically from treatment, but can be overwritten by a character vector containing sample names.
assembly
the assembly description (e.g. "hg18")
context
the experimental context of the data (e.g. "CpG")
add.info
if set to TRUE, the output will be a list with the first element being the methylBase object and a vector containing the treatment effect sizes of all sites as the second element.
a methylBase object containing simulated methylation data, or, if add.info is TRUE, a list containing the methylBase object and the indices of all treated sites as the second element.
While the coverage is modeled with a binomial distribution, the function uses
a Beta distribution to simulate the methylation background across all samples.
The parameters alpha, beta and theta determine this Beta distribution and
thereby the methylation values.
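The interplay of alpha, beta and theta can be illustrated with a minimal base R sketch. It is not the package's internal code: the coverage settings (size = 100, prob = 0.3) are hypothetical, and the way theta enters (as the precision of a second Beta draw around the site-level background) is an assumption made for illustration only.

```r
set.seed(1)

sites <- 1000   # number of CpG sites (illustrative)
alpha <- 0.4    # shape1 of the background Beta distribution
beta  <- 0.5    # shape2 of the background Beta distribution
theta <- 10     # dispersion; assumed here to act as a precision parameter

# Background methylation fraction per site, shared across samples.
background <- rbeta(sites, shape1 = alpha, shape2 = beta)

# Coverage per site for one sample; size and prob are hypothetical values.
coverage <- rbinom(sites, size = 100, prob = 0.3)

# Assumption: the sample-level methylation scatters around the background
# following Beta(background * theta, (1 - background) * theta).
sample.prob <- rbeta(sites, background * theta, (1 - background) * theta)

# Methylated and unmethylated read counts given coverage.
numCs <- rbinom(sites, size = coverage, prob = sample.prob)
numTs <- coverage - numCs
```

With alpha and beta both below 1, the background is U-shaped, so most simulated sites are either mostly methylated or mostly unmethylated. Under the assumption above, a larger theta keeps the samples closer to the shared background.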
The parameters percentage and effect determine the proportion of sites that are
affected by the treatment and the strength of this influence, respectively.
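As a rough illustration of how percentage and effect could translate into per-site changes, the following base R sketch selects the affected sites and assigns each one an effect size. The variable names are hypothetical, and the additive shift capped at 1 is an assumption, not necessarily how dataSim applies the effect internally.

```r
set.seed(2)

sites      <- 2000
percentage <- 10              # percent of sites affected by the treatment
effect     <- c(40, 50, 60)   # candidate effect sizes in percent

# Pick the affected sites at random.
n.treated <- round(sites * percentage / 100)
treated   <- sort(sample(sites, n.treated))

# Give each affected site one effect size. A single number is repeated,
# because sample() on a length-one vector would sample from 1:effect.
site.effect <- if (length(effect) == 1) {
  rep(effect, n.treated)
} else {
  sample(effect, n.treated, replace = TRUE)
}

# Hypothetical background methylation, then an assumed additive shift
# for the treated sites, capped at full methylation.
background   <- rbeta(sites, 0.4, 0.5)
treated.meth <- background
treated.meth[treated] <- pmin(background[treated] + site.effect / 100, 1)
```

Here 200 of the 2000 sites receive a shift, and the untreated sites keep their background value.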
The additional information needed for a valid methylBase.obj is generated as
"dummy values", but can be overwritten as needed.
# NOT RUN {
library(methylKit)
data(methylKit)
# Simulate data for 4 samples with 2000 sites each.
# The methylation in 10% of the sites is elevated by 25%.
my.methylBase=dataSim(replicates=4,sites=2000,treatment=c(1,1,0,0),
                      percentage=10,effect=25)
# Simulate data with variable effect sizes of the treatment.
# The methylation in 30% of the sites is elevated by 40%, 50% or 60%.
my.methylBase2=dataSim(replicates=4,sites=2000,treatment=c(1,1,0,0),
                       percentage=30,effect=c(40,50,60))
# }