# stepFlexmix

##### Run FlexMix Repeatedly

Runs flexmix repeatedly for different numbers of components and returns the maximum likelihood solution for each.

- Keywords
- regression, cluster

##### Usage

```
initFlexmix(..., k, init = list(), control = list(), nrep = 3L,
            verbose = TRUE, drop = TRUE, unique = FALSE)

initMethod(name = c("tol.em", "cem.em", "sem.em"),
           step1 = list(tolerance = 10^-2),
           step2 = list(), control = list(), nrep = 3L)

stepFlexmix(..., k = NULL, nrep = 3, verbose = TRUE, drop = TRUE,
            unique = FALSE)

# S4 method for stepFlexmix,missing
plot(x, y, what = c("AIC", "BIC", "ICL"),
     xlab = NULL, ylab = NULL, legend = "topright", ...)

# S4 method for stepFlexmix
getModel(object, which = "BIC")

# S4 method for stepFlexmix
unique(x, incomparables = FALSE, ...)
```

##### Arguments

- …
- k
A vector of integers passed in turn to the `k` argument of `flexmix`.

- init
An object of class `"initMethod"`, or a named list; in the latter case `initMethod` is called with the list elements as arguments in addition to the `control` argument.

- name
A character string indicating which initialization strategy should be employed: short runs of EM followed by a long EM run (`"tol.em"`), short runs of CEM followed by a long EM run (`"cem.em"`), or short runs of SEM followed by a long EM run (`"sem.em"`).

- step1
A named list which, combined with the `control` argument, is coercible to a `"FLXcontrol"` object. This control setting is used for the short runs.

- step2
A named list which, combined with the `control` argument, is coercible to a `"FLXcontrol"` object. This control setting is used for the long run.

- control
A named list which, combined with the `step1` or the `step2` argument, is coercible to a `"FLXcontrol"` object.

- nrep
For each value of `k`, run `flexmix` `nrep` times and keep only the solution with maximum likelihood. If `nrep` is set for the long run, it is ignored, because the EM algorithm is deterministic when initialized with the best solution found in the short runs.

- verbose
If `TRUE`, show progress information during computations.

- drop
If `TRUE` and `k` is of length 1, then a single flexmix object is returned instead of a `"stepFlexmix"` object.

- unique
If `TRUE`, then `unique()` is called on the result, see below.

- x, object
An object of class `"stepFlexmix"`.

- y
Not used.

- what
Character vector naming information criteria to plot. Functions of the same name must exist, which take a `stepFlexmix` object as input and return a numeric vector like `AIC,stepFlexmix-method` (see examples below).

- xlab, ylab
Graphical parameters.

- legend
If not `FALSE` and `what` contains more than 1 element, a legend is placed at the specified location, see `legend` for details.

- which
Number of the model to get. If character, it is interpreted as the number of components or the name of an information criterion.

- incomparables
A vector of values that cannot be compared. Currently, `FALSE` is the only possible value, meaning that all values can be compared.
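The `init` argument can be illustrated with a short sketch that tries each of the three strategy names listed under `name`. It assumes the flexmix package is installed and reuses the `Nclus` data and `FLXMCmvnorm` model from the Examples section; the variable names `strategies` and `fits` are hypothetical, and the snippet is meant as an illustration rather than a recommended workflow.

```r
# Sketch only: the body runs when the flexmix package is available.
strategies <- c("tol.em", "cem.em", "sem.em")
fits <- list()

if (requireNamespace("flexmix", quietly = TRUE)) {
  library(flexmix)
  data("Nclus", package = "flexmix")
  set.seed(511)

  # Passing a named list as 'init' means initMethod() is called with it,
  # so 'name' selects the initialization strategy.
  fits <- lapply(strategies, function(s) {
    initFlexmix(Nclus ~ 1, k = 4,
                model = FLXMCmvnorm(diagonal = FALSE),
                init = list(name = s),
                nrep = 2, verbose = FALSE)
  })
  names(fits) <- strategies

  # With drop = TRUE and a single k, each element is a flexmix object,
  # so the log-likelihoods can be compared directly.
  print(sapply(fits, function(f) as.numeric(logLik(f))))
}
```

Because the short runs start from random initializations, the log-likelihoods depend on the seed.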

##### Value

An object of class `"stepFlexmix"` containing, in a slot, the best models with respect to the log likelihood for the different numbers of components if `length(k) > 1`; otherwise, directly an object of class `"flexmix"`.

If `unique = FALSE`, then the resulting object contains one model per element of `k` (which is the number of clusters the EM algorithm started with). If `unique = TRUE`, then the result is re-sorted according to the number of clusters contained in the fitted models (which may be less than the number with which the EM algorithm started), and only the maximum likelihood solution for each number of fitted clusters is kept. This operation can also be done manually by calling `unique()` on objects of class `"stepFlexmix"`.
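The effect of `unique = TRUE` can be sketched in base R on a made-up summary table. The data frame `fits`, its columns, and the numbers in it are hypothetical and are not produced by flexmix; the sketch only mirrors the reduction described above, not the package's actual implementation.

```r
# Hypothetical summary of fitted models: one row per element of k.
fits <- data.frame(
  k_start  = c(2, 3, 4, 5),                 # clusters EM started with
  k_fitted = c(2, 3, 3, 4),                 # clusters left after fitting
  logLik   = c(-520.4, -480.1, -475.9, -470.2)
)

# For each number of fitted clusters, keep the row with the highest
# log-likelihood; split() orders the groups by k_fitted.
best_rows <- vapply(
  split(seq_len(nrow(fits)), fits$k_fitted),
  function(idx) idx[which.max(fits$logLik[idx])],
  integer(1)
)

reduced <- fits[best_rows, ]
rownames(reduced) <- NULL
print(reduced)
```

Here the runs started with 3 and 4 clusters both end with 3 fitted clusters, so only the one with the larger log-likelihood is retained.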

##### References

Friedrich Leisch. FlexMix: A general framework for finite mixture
models and latent class regression in R. *Journal of Statistical
Software*, **11**(8), 2004. doi:10.18637/jss.v011.i08

Christophe Biernacki, Gilles Celeux and Gerard Govaert. Choosing
starting values for the EM algorithm for getting the highest
likelihood in multivariate Gaussian mixture models. *Computational
Statistics & Data Analysis*, **41**(3--4), 561--575, 2003.

Theresa Scharl, Bettina Gruen and Friedrich Leisch. Mixtures of
regression models for time-course gene expression data: Evaluation of
initialization and random effects. *Bioinformatics*,
**26**(3), 370--377, 2010.

##### Examples

```
library(flexmix)
data("Nclus", package = "flexmix")

## try 2 times for k = 4
set.seed(511)
ex1 <- initFlexmix(Nclus ~ 1, k = 4, model = FLXMCmvnorm(diagonal = FALSE),
                   nrep = 2)
ex1

## now 2 times each for k = 2:5, specify control parameter
ex2 <- initFlexmix(Nclus ~ 1, k = 2:5, model = FLXMCmvnorm(diagonal = FALSE),
                   control = list(minprior = 0), nrep = 2)
ex2
plot(ex2)

## get BIC values
BIC(ex2)

## get smallest model
getModel(ex2, which = 1)

## get model with 3 components
getModel(ex2, which = "3")

## get model with smallest ICL (here same as for AIC and BIC: true k = 4)
getModel(ex2, which = "ICL")

## now 1 time each for k = 2:5, with larger minimum prior
ex3 <- initFlexmix(Nclus ~ 1, k = 2:5,
                   model = FLXMCmvnorm(diagonal = FALSE),
                   control = list(minprior = 0.1), nrep = 1)
ex3

## keep only maximum likelihood solution for each unique number of
## fitted clusters:
unique(ex3)
```

*Documentation reproduced from package flexmix, version 2.3-17, License: GPL (>= 2)*