stepFlexmix
Run FlexMix Repeatedly
Runs flexmix repeatedly for different numbers of components and returns the maximum likelihood solution for each.
- Keywords
- regression, cluster
Usage
initFlexmix(..., k, init = list(), control = list(), nrep = 3L,
            verbose = TRUE, drop = TRUE, unique = FALSE)
initMethod(name = c("tol.em", "cem.em", "sem.em"),
           step1 = list(tolerance = 10^-2),
           step2 = list(), control = list(), nrep = 3L)
stepFlexmix(..., k=NULL, nrep=3, verbose=TRUE, drop=TRUE,
            unique=FALSE, multicore = TRUE)
## S4 method for signature 'stepFlexmix,missing'
plot(x, y, what=c("AIC", "BIC", "ICL"),
     xlab=NULL, ylab=NULL, legend="topright", ...)
## S4 method for signature 'stepFlexmix'
getModel(object, which="BIC")
## S4 method for signature 'stepFlexmix'
unique(x, incomparables = FALSE, ...)
Arguments
- ...
- Passed to flexmix (or matplot in the plot method).
- k
- A vector of integers passed in turn to the k argument of flexmix.
- init
- An object of class "initMethod", or a named list whose elements are passed as arguments to initMethod in addition to the control argument.
- name
- A character string indicating which initialization strategy should be employed: short runs of EM followed by a long EM run ("tol.em"), short runs of CEM followed by a long EM run ("cem.em"), or short runs of SEM followed by a long EM run ("sem.em").
- step1
- A named list which, combined with the control argument, is coercible to a "FLXcontrol" object. This control setting is used for the short runs.
- step2
- A named list which, combined with the control argument, is coercible to a "FLXcontrol" object. This control setting is used for the long run.
- control
- A named list which, combined with the step1 or the step2 argument, is coercible to a "FLXcontrol" object.
- nrep
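- For each value of k, run flexmix nrep times and keep only the solution with maximum likelihood. If nrep is set for the long run, it is ignored, because the EM algorithm is deterministic once it is initialized with the best solution found in the short runs.
- verbose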
- If TRUE, show progress information during computations.
- drop
- If TRUE and k is of length 1, then a single flexmix object is returned instead of a "stepFlexmix" object.
- unique
- If TRUE, then unique() is called on the result, see below.
- multicore
- If TRUE, use mclapply() from package parallel for parallel processing. If an object of class "cluster", use parLapply() from package parallel. If FALSE, no parallel processing is done.
- x, object
- An object of class "stepFlexmix".
- y
- Not used.
- what
- Character vector naming information criteria to plot. Functions of the same name must exist, which take a stepFlexmix object as input and return a numeric vector like AIC,stepFlexmix-method (see examples below).
- xlab,ylab
- Graphical parameters.
- legend
- If not FALSE and what contains more than 1 element, a legend is placed at the specified location, see legend for details.
- which
- Number of the model to get. If character, interpreted as the number of components or the name of an information criterion.
- incomparables
- A vector of values that cannot be compared. Currently, FALSE is the only possible value, meaning that all values can be compared.
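The init, name and step1 arguments described above can be combined in two ways: as a named list that initFlexmix hands on to initMethod, or as an "initMethod" object built beforehand. The following is a minimal sketch using only the signatures shown under Usage; the object names fit_cem, im and fit_tol and the tolerance value are illustrative assumptions, not recommended settings.

```r
library("flexmix")
data("Nclus", package = "flexmix")
set.seed(511)

## Named-list form: initMethod() is called internally with these arguments.
## Short CEM runs, followed by one long EM run from the best short run.
fit_cem <- initFlexmix(Nclus ~ 1, k = 4,
                       model = FLXMCmvnorm(diagonal = FALSE),
                       init = list(name = "cem.em"),
                       nrep = 5, verbose = FALSE)

## Object form: build the "initMethod" object explicitly.
## The tolerance for the short runs is an arbitrary illustrative value.
im <- initMethod(name = "tol.em",
                 step1 = list(tolerance = 10^-3),
                 nrep = 3L)
fit_tol <- initFlexmix(Nclus ~ 1, k = 4,
                       model = FLXMCmvnorm(diagonal = FALSE),
                       init = im, verbose = FALSE)

fit_cem
fit_tol
```

Because k has length 1 and drop defaults to TRUE, both calls should return a single flexmix object rather than a "stepFlexmix" object.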
Value
- An object of class "stepFlexmix" containing the best models with respect to the log likelihood for the different number of components in a slot if length(k)>1, else directly an object of class "flexmix".
If unique=FALSE, then the resulting object contains one model per element of k (which is the number of clusters the EM algorithm started with). If unique=TRUE, then the result is resorted according to the number of clusters contained in the fitted models (which may be less than the number with which the EM algorithm started), and only the maximum likelihood solution for each number of fitted clusters is kept. This operation can also be done manually by calling unique() on objects of class "stepFlexmix".
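The selection rule behind unique=TRUE can be illustrated without fitting any model. The sketch below uses base R only and a made-up table of results (the columns k_start, k_fitted and logLik and all of their values are hypothetical); it mimics the rule described above and is not the package's actual implementation.

```r
## Hypothetical summary of five fits: the number of clusters EM started with,
## the number left in the fitted model, and the log-likelihood reached.
fits <- data.frame(
  k_start  = c(2, 3, 4, 5, 6),
  k_fitted = c(2, 3, 4, 4, 3),
  logLik   = c(-1500, -1320, -1210, -1205, -1340)
)

## Resort by the fitted number of clusters, best log-likelihood first
## within each group.
fits <- fits[order(fits$k_fitted, -fits$logLik), ]

## Keep only the first (maximum likelihood) row per fitted number of clusters.
best <- fits[!duplicated(fits$k_fitted), ]
rownames(best) <- NULL
print(best)
```

In this toy table the fits started with 5 and 6 clusters collapse to 4 and 3 fitted clusters, so each competes with the fit that started at that size and only the one with the higher log-likelihood survives.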
References
Friedrich Leisch. FlexMix: A general framework for finite mixture models and latent class regression in R. Journal of Statistical Software, 11(8), 2004. http://www.jstatsoft.org/v11/i08/
Christophe Biernacki, Gilles Celeux and Gerard Govaert. Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models. Computational Statistics & Data Analysis, 41(3-4), 561--575, 2003.
Theresa Scharl, Bettina Gruen and Friedrich Leisch. Mixtures of regression models for time-course gene expression data: Evaluation of initialization and random effects. Bioinformatics, 26(3), 370--377, 2010.
Examples
library("flexmix")
data("Nclus", package = "flexmix")
## try 5 times for k=4
set.seed(511)
ex1 <- initFlexmix(Nclus~1, k=4, model=FLXMCmvnorm(diagonal=FALSE),
                   nrep = 5)
ex1
## now 3 times each for k=2:6, specify control parameter
ex2 <- initFlexmix(Nclus~1, k=2:6, model=FLXMCmvnorm(diagonal=FALSE),
                   control=list(minprior=0), nrep=3)
ex2
plot(ex2)
## get BIC values
BIC(ex2)
## get smallest model
getModel(ex2, which=1)
## get model with 3 components
getModel(ex2, which="3")
## get model with smallest ICL (here same as for AIC and BIC: true k=4)
getModel(ex2, which="ICL")
## now 1 time each for k=2:6, with larger minimum prior
ex3 <- initFlexmix(Nclus~1, k=2:6, model=FLXMCmvnorm(diagonal=FALSE),
                   control=list(minprior=0.1), nrep=1)
ex3
## keep only maximum likelihood solution for each unique number of
## fitted clusters:
unique(ex3)