
forwsimDiffExpr(fit, x, groups, ngenes, maxBatch, batchSize, fdrmax = 0.05, genelimit, v0thre = 1, B = 100,
Bsummary = 100, trace = TRUE, randomSeed, mc.cores=1)
gagafit
,
as returned by fitGG
) or Normal-Normal fit (type nnfit
returned by fitNN
). ExpressionSet
, exprSet
, data frame or matrix
containing the gene expression measurements used to fit the model.x
is of type ExpressionSet
or
exprSet
, groups
should be the name of the column
in pData(x)
with the groups that one wishes to compare. If
x
is a matrix or a data frame, groups
should be a
vector indicating to which group each column in x
corresponds to.x
is
specified this argument is set to nrow(x)
and data is
simulated from the posterior predictive conditional on x
. If
x
not specified simulation is from the prior predictive. batchSize
*maxBatch
samples per group.ncol(x)/length(unique(groups))
. genelimit
genes with the lowest
probability of being equally expressed across all groups will be
simulated. Setting this limit can significantly increase the
computational speed. v0thre
will be simulated. Setting this limit can significantly increase the
computational speed.trace==TRUE
iteration progress is displayed.as.numeric(Sys.time())
modulus 10^6.multicore
package is available,
mc.cores
indicates the number of cores to use for parallel
computing. Currently only used when fit
is of class nnfit
.data.frame
with the following columns:
plotForwSim
to plot the simulated trajectories,
fitGG
for fitting a GaGa model,
fitNN
for fitting a normal-normal model,
seqBoundariesGrid
for finding the optimal design based
on the forwards simulation output.
powfindgenes
for fixed sample size calculations.#Simulate data and fit GaGa model
set.seed(1)
x <- simGG(n=20,m=2,p.de=.5,a0=3,nu=.5,balpha=.5,nualpha=25)
gg1 <- fitGG(x,groups=1:2,method='EM')
gg1 <- parest(gg1,x=x,groups=1:2)
#Run forward simulation
fs1 <- forwsimDiffExpr(gg1, x=x, groups=1:2,
maxBatch=2,batchSize=1,fdrmax=0.05, B=100, Bsummary=100, randomSeed=1)
#Expected number of true positives for each sample size
tapply(fs1$u,fs1$time,'mean')
#Expected utility for each sample size
samplingCost <- 0.01
tapply(fs1$u,fs1$time,'mean') - samplingCost*(0:2)
#Optimal sequential design
b0seq <- seq(0,20,length=200); b1seq <- seq(0,40,length=200)
bopt <-seqBoundariesGrid(b0=b0seq,b1=b1seq,forwsim=fs1,samplingCost=samplingCost,powmin=0)
bopt <- bopt$opt
plot(fs1$time,fs1$summary,xlab='Additional batches',ylab='E(newly discovered DE genes)')
abline(bopt['b0'],bopt['b1'])
text(.2,bopt['b0'],'Continue',pos=3)
text(.2,bopt['b0'],'Stop',pos=1)
Run the code above in your browser using DataLab