emfit: Implements EM algorithm for gene expression mixture model

Description

Implements the EM algorithm for the gene expression mixture model.

Usage

emfit(data, family, hypotheses, ...)

Arguments

data

a matrix of expression values, with one row per gene and one column per sample

family

an object of class "ebarraysFamily" or a character string which can be coerced to one. Currently, only the character strings "GG", "LNN", and "LNNMV" are valid. For "LNNMV", a groupid is required (see the groupid argument below). Other families can be supplied by constructing them explicitly.

hypotheses

an object of class "ebarraysPatterns" representing the hypotheses of interest. Such patterns can be generated by the function ebPatterns.

...

other arguments. These include:

cluster: if type=1, cluster is a vector specifying the fixed cluster membership for each gene; if type=2, cluster specifies the number of clusters to be fitted
type: if type=1, the cluster membership is fixed as input cluster; if type=2, fit the data with a fixed number of clusters
criterion: only used when type=2 and cluster contains more than one integer. Every number of clusters provided in cluster will be fitted, and the fit that minimizes criterion will be returned. Possible values are "BIC", "AIC" and "HQ".
cluster.init: only used when type=2. Specifies the initial cluster membership.
num.iter: number of EM iterations
verbose: logical or numeric (0,1,2) indicating desired level of information printed for the user
optim.control: list passed unchanged to optim for finer control
groupid: an integer vector indicating which group each sample belongs to; required for the "LNNMV" model. It does not depend on hypotheses.
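To make the criterion argument more concrete, the sketch below shows how the three named criteria are conventionally computed from a maximized log-likelihood and how the minimizing number of clusters would be picked. This is a base R illustration only: info_criterion is a hypothetical helper, the log-likelihoods and parameter counts are made-up numbers, and the textbook formulas used here are an assumption about, not a copy of, what emfit does internally.

```r
# Hypothetical helper (not part of EBarrays): textbook information criteria
# computed from a maximized log-likelihood. Smaller values are better.
info_criterion <- function(loglik, n_par, n_obs,
                           criterion = c("BIC", "AIC", "HQ")) {
  criterion <- match.arg(criterion)
  penalty <- switch(criterion,
                    BIC = n_par * log(n_obs),           # Bayesian information criterion
                    AIC = 2 * n_par,                    # Akaike information criterion
                    HQ  = 2 * n_par * log(log(n_obs)))  # Hannan-Quinn criterion
  -2 * loglik + penalty
}

# Made-up values standing in for fits with 1, 2 and 3 clusters
candidates <- data.frame(clusters = 1:3,
                         loglik   = c(-1250, -1190, -1185),
                         n_par    = c(3, 7, 11))
n_obs <- 500  # assumed number of genes

candidates$BIC <- info_criterion(candidates$loglik, candidates$n_par, n_obs, "BIC")

# The number of clusters with the smallest criterion value is selected
best <- candidates$clusters[which.min(candidates$BIC)]
print(candidates)
print(best)
```

In this toy example the three-cluster fit has the highest log-likelihood, but its extra parameters are penalized enough that the two-cluster fit has the smallest BIC.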

Value

an object of class "ebarraysEMfit" that can be summarized by show() and used to generate posterior probabilities with postprob.
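A minimal sketch of passing the fitted object on to postprob is given below. It reuses the sample data from the Examples section; the postprob(fit, data) call form and the pattern component of its result are stated here as assumptions about the EBarrays interface, so check the postprob help page before relying on them.

```r
library(EBarrays)                     # provides emfit, ebPatterns and postprob
data(sample.ExpressionSet)            # example data from Biobase
eset <- exprs(sample.ExpressionSet)   # genes in rows, 26 samples in columns

# Two hypotheses: all 26 samples equal, or samples 1-13 differ from 14-26
patterns <- ebPatterns(c(paste(rep(1, 26), collapse = " "),
                         paste(c(rep(1, 13), rep(2, 13)), collapse = " ")))

gg.fit <- emfit(data = eset, family = "GG", hypotheses = patterns,
                num.iter = 10)

# Assumption: postprob takes the fit and the data, and the result exposes the
# per-gene pattern probabilities as `pattern`
pp <- postprob(gg.fit, eset)$pattern
head(pp)
```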

Details

There are many optional arguments, so a full call might look more like this:

emfit(data, family, hypotheses, cluster, type=2, criterion="BIC", cluster.init = NULL, num.iter = 20, verbose = getOption("verbose"), optim.control = list(), ...)
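As an illustration of the optional arguments, the following sketch fits one, two and three clusters and lets criterion choose among them. It is an untested example under stated assumptions: the data and patterns are borrowed from the Examples section, and the choice of cluster = 1:3 and num.iter = 10 is arbitrary.

```r
library(EBarrays)                     # provides emfit and ebPatterns
data(sample.ExpressionSet)            # example data from Biobase
eset <- exprs(sample.ExpressionSet)   # genes in rows, 26 samples in columns

# Two hypotheses: all 26 samples equal, or samples 1-13 differ from 14-26
patterns <- ebPatterns(c(paste(rep(1, 26), collapse = " "),
                         paste(c(rep(1, 13), rep(2, 13)), collapse = " ")))

# type = 2 with several candidate cluster counts: each is fitted and the
# fit minimizing the chosen criterion is returned
gg.fit.clustered <- emfit(data = eset, family = "GG", hypotheses = patterns,
                          cluster = 1:3, type = 2, criterion = "BIC",
                          num.iter = 10, verbose = TRUE)
show(gg.fit.clustered)
```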

References

Newton, M.A., Kendziorski, C.M., Richmond, C.S., Blattner, F.R. (2001). On differential variability of expression ratios: Improving statistical inference about gene expression changes from microarray data. Journal of Computational Biology 8:37-52.

Kendziorski, C.M., Newton, M.A., Lan, H., Gould, M.N. (2003). On parametric empirical Bayes methods for comparing multiple groups using replicated gene expression profiles. Statistics in Medicine 22:3899-3914.

Newton, M.A. and Kendziorski, C.M. (2003). Parametric empirical Bayes methods for microarrays. In The Analysis of Gene Expression Data: Methods and Software, eds. G. Parmigiani, E.S. Garrett, R. Irizarry and S.L. Zeger. New York: Springer Verlag.

Newton, M.A., Noueiry, A., Sarkar, D., and Ahlquist, P. (2004). Detecting differential gene expression with a semiparametric hierarchical mixture model. Biostatistics 5: 155-176.

Yuan, M. and Kendziorski, C. (2006). A unified approach for simultaneous gene clustering and differential expression identification. Biometrics 62(4): 1089-1098.

Examples

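library(EBarrays) ## provides emfit and ebPatterns
data(sample.ExpressionSet) ## from Biobase
eset <- exprs(sample.ExpressionSet) ## expression matrix with 26 sample columns
## Two hypotheses: all 26 samples equal, or samples 1-13 differ from samples 14-26
patterns <- ebPatterns(c("1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1",
                         "1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2"))
gg.fit <- emfit(data = eset, family = "GG", hypotheses = patterns, verbose = TRUE)
show(gg.fit) ## summarize the fitted model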