Conducts a greedy forward stepwise search to identify the optimal MoEClust model according to some criterion. Components and/or gating covariates and/or expert covariates are added to new MoE_clust fits at each step, while each step is evaluated for all valid modelNames.
MoE_stepwise(data,
             network.data = NULL,
             gating = NULL,
             expert = NULL,
             modelNames = NULL,
             fullMoE = FALSE,
             noise = FALSE,
             initialModel = NULL,
             initialG = NULL,
             stepG = TRUE,
             criterion = c("bic", "icl", "aic"),
             equalPro = c("all", "both", "yes", "no"),
             noise.gate = c("all", "both", "yes", "no"),
             verbose = interactive(),
             ...)
An object of class "MoECompare" containing information on all visited models and the optimal model (accessible via x$optimal).
data: A numeric vector, matrix, or data frame of observations. Categorical variables are not allowed. If a matrix or data frame, rows correspond to observations and columns correspond to variables.
network.data: An optional matrix or data frame in which to look for the covariates specified in the gating &/or expert networks, if any. Must include column names. Columns in network.data corresponding to columns in data will be automatically removed. While a single covariate can be supplied as a vector (provided neither the '$' operator nor the '[]' subset operator is used), it is safer to supply a named 1-column matrix or data frame in this instance.
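The advice about supplying a named 1-column data frame can be illustrated with base R alone. The sketch below uses a small made-up data frame (covs, with hypothetical columns GNP and Pop) and simply shows that extracting a column with '$' discards the column name, whereas subsetting with drop = FALSE keeps it.

```r
# Hypothetical toy covariate table, for illustration only
covs <- data.frame(GNP = c(1.2, 3.4, 5.6), Pop = c(10, 20, 30))

# Extracting with `$` yields a bare numeric vector: the column name is lost
as_vector <- covs$GNP

# Subsetting with drop = FALSE yields a 1-column data frame: the name is kept
as_column <- covs[, "GNP", drop = FALSE]

names(as_vector)     # no names attached to the vector
colnames(as_column)  # column name retained
```

The second form is the one used in the CO2data example for this function.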
gating: A vector giving the names of columns in network.data used to define the scope of the gating network. By default, the initial model will contain no covariates (unless initialModel is supplied with gating covariates); thereafter, all variables in gating (save for those in initialModel, if any) will be considered for inclusion where appropriate.
If gating is not supplied (or set to NULL), all variables in network.data will be considered for the gating network. gating can also be supplied as NA, in which case no gating network covariates will ever be considered (save for those in initialModel, if any). Supplying gating and expert can be used to ensure different subsets of covariates enter different parts of the model.
expert: A vector giving the names of columns in network.data used to define the scope of the expert network. By default, the initial model will contain no covariates (unless initialModel is supplied with expert covariates); thereafter, all variables in expert (save for those in initialModel, if any) will be considered for inclusion where appropriate.
If expert is not supplied (or set to NULL), all variables in network.data will be considered for the expert network. expert can also be supplied as NA, in which case no expert network covariates will ever be considered (save for those in initialModel, if any). Supplying expert and gating can be used to ensure different subsets of covariates enter different parts of the model.
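As a hedged sketch of how gating and expert scope the search, the calls below use the ais data shipped with MoEClust. The particular covariates ("BMI" and "sex") and the restriction to the "EEE" model type are illustrative assumptions chosen to keep the search small, not recommendations. The code is guarded so that it only runs when MoEClust is installed.

```r
# Sketch only: requires the MoEClust package. The covariate choices below
# are assumptions made purely for illustration.
if (requireNamespace("MoEClust", quietly = TRUE)) {
  library(MoEClust)
  data(ais)

  # Different candidates per network: only "BMI" may enter the gating
  # network and only "sex" may enter the expert network
  mod_split  <- MoE_stepwise(ais[, 3:7], ais, gating = "BMI", expert = "sex",
                             modelNames = "EEE", verbose = FALSE)

  # gating = NA: gating network covariates are never considered
  mod_nogate <- MoE_stepwise(ais[, 3:7], ais, gating = NA, expert = "sex",
                             modelNames = "EEE", verbose = FALSE)
}
```

Leaving both arguments at NULL instead would let every column of network.data compete for a place in either network.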
modelNames: A character vector of valid model names, to be used to restrict the size of the search space, if desired. By default, all valid model types are explored. Rather than considering the changing of the model type as an additional step, every step is evaluated over all entries in modelNames. See MoE_clust for more details.
Note that if initialModel is supplied (see below), modelNames will be augmented with initialModel$modelName if need be.
fullMoE: A logical which, when TRUE, ensures that only models where the same covariates enter both parts of the model (the gating and expert networks) are considered. This restricts the search space to exclude models where covariates differ across networks. Thus, the search is likely to be faster, at the expense of potentially missing out on optimal models. Defaults to FALSE.
Furthermore, when TRUE, the set of candidate covariates is automatically taken to be the union of the named covariates in gating and expert, for convenience. In other words, gating=NA will only work if expert=NA also, and both should be set to NULL in order to consider all potential covariates.
In addition, caution is advised using this argument in conjunction with initialModel, which must satisfy the constraint that the same set of covariates be used in both parts of the model, for initial models where gating covariates are allowable. Finally, note that this argument does not preclude a model with only expert covariates included if the number of components is such that the inclusion of gating covariates is infeasible.
noise: A logical indicating whether to assume all models contain an additional noise component (TRUE) or not (FALSE, the default). If neither initialModel nor initialG is specified, the search starts from a G=0 noise-only model when noise is TRUE, and from a G=1 model with no covariates when noise is FALSE. See MoE_control for more details. Note, however, that if the model specified in initialModel contains a noise component, the value of the noise argument will be overridden to TRUE; similarly, if the initialModel model does not contain a noise component, noise will be overridden to FALSE.
initialModel: An object of class "MoEClust" generated by MoE_clust or an object of class "MoECompare" generated by MoE_compare. This gives the initial model to use at the first step of the selection algorithm, to which components and/or covariates etc. can be added. Especially useful if the model is expected to have more than one component a priori (see initialG below as an alternative). The initialModel model must have been fitted to the same data in data.
If initialModel is not specified, the search starts from a G=0 noise-only model when noise is TRUE, and from a G=1 model with no covariates when noise is FALSE. If initialModel is supplied and it contains a noise component, only models with a noise component will be considered thereafter (i.e. the noise argument can be overridden by the initialModel argument). If initialModel contains gating &/or expert covariates, these covariates will be included in all subsequent searches, with covariates in expert and gating still considered as candidates for additional inclusion, as normal.
However, while initialModel can include covariates not specified in gating &/or expert, the initialModel$modelName should be included in the specified modelNames; if it is not, modelNames will be forcibly augmented with initialModel$modelName (as stated above). Furthermore, it is assumed that initialModel is already optimal with respect to the model type. If it is not, the algorithm may be liable to converge to a sub-optimal model, and so a warning will be printed if the function suspects that this might be the case.
initialG: A single (positive) integer giving the number of mixture components (clusters) to initialise the stepwise search algorithm with. This is a simpler alternative to the initialModel argument, to be used when the only prior knowledge relates to the number of components, and not other features of the model (e.g. the covariates which should be included). Consequently, initialG is only relevant when initialModel is not supplied. When neither initialG nor initialModel is specified, the search starts from a G=0 noise-only model when noise is TRUE, and from a G=1 model with no covariates when noise is FALSE. See stepG below for fixing the number of components at this initialG value.
stepG: A logical indicating whether the algorithm should consider incrementing the number of components at each step. Defaults to TRUE; use FALSE when searching only over configurations with the same number of components is of interest. Setting stepG to FALSE is possible with or without specifying initialModel or initialG, but is primarily intended for use when one of these arguments is supplied; otherwise, the algorithm will be stuck forever with only one component.
criterion: The model selection criterion used to determine the optimal action at each step. Defaults to "bic".
equalPro: A character string indicating whether models with equal mixing proportions should be considered. "both" means models with both equal and unequal mixing proportions will be considered, "yes" means only models with equal mixing proportions will be considered, and "no" means only models with unequal mixing proportions will be considered. Notably, no setting for equalPro is enough to rule out models with gating covariates from consideration.
The default ("all") is equivalent to "both" with the addition that all possible mixing proportion constraints will be tried for the initialModel (if any, provided it doesn't contain gating covariate(s)) or initialG before adding a component or additional covariates; otherwise, this equalPro argument only governs whether mixing proportion constraints are considered as components are added.
Considering "all" (or "both") equal and unequal mixing proportion models increases the search space and the computational burden, but this argument becomes irrelevant after a model, if any, with gating network covariate(s) is considered optimal for a given step. The "all" default is strongly recommended so that viable candidate models are not missed out on, particularly when initialModel or initialG is given. However, this does not guarantee that an optimal model will not be skipped; if equalPro is restricted via "yes" or "no", a suboptimal model at one step may ultimately lead to a better final model, in some edge cases. See MoE_control for more details.
noise.gate: A character string indicating whether models where the gating network for the noise component depends on covariates are considered. "yes" means only models where this is the case will be considered, "no" means only models for which the noise component's mixing proportion is constant will be considered, and "both" means both of these scenarios will be considered.
The default ("all") is equivalent to "both" with the addition that all possible gating network noise settings will be tried for the initialModel (if any, provided it contains gating covariates and a noise component) before adding a component or additional covariates; otherwise, this noise.gate argument only governs the inclusion/exclusion of this constraint as components or covariates are added.
Considering "all" (or "both") settings increases the search space and the computational burden, but this argument is only relevant when noise=TRUE and gating covariates are being considered. The "all" default is strongly recommended so that viable candidate models are not missed out on, particularly when initialModel or initialG is given. However, this does not guarantee that an optimal model will not be skipped; if noise.gate is restricted via "yes" or "no", a suboptimal model at one step may ultimately lead to a better final model, in some edge cases. See MoE_control for more details.
verbose: Logical indicating whether to print messages pertaining to progress to the screen during fitting. Defaults to TRUE if the session is interactive, and FALSE otherwise. If FALSE, warnings and error messages will still be printed to the screen, but everything else will be suppressed.
...: Additional arguments to MoE_control, except for those arguments of the same name which are already listed here, e.g. equalPro and noise.gate. Note that these arguments will be supplied to all candidate models for every step. For arguments specific to MoE_control (e.g. stopping, algo, etc.), it is recommended to run MoE_stepwise multiple times while toggling these arguments, if desired.
Keefe Murphy - <keefe.murphy@mu.ie>
The arguments modelNames, equalPro, and noise.gate are provided for computational convenience. They can be used to reduce the number of models under consideration at each stage.
The same is true of the arguments gating and expert, which can each separately (or jointly, if fullMoE is TRUE) be made to consider all variables in network.data, or a subset, or none at all.
Finally, initialModel or initialG can be used to kick-start the search algorithm by incorporating prior information in a more direct way. With initialG, this information takes the form of the number of components only. With initialModel, a full model with a given number of components, certain included gating and expert network covariates, and a certain model type can give the search an even more informed head start. In either case, the stepG argument can be used to fix the number of components and only search over different configurations of covariates.
Without any prior information, it is best to accept the defaults at the expense of a longer run-time.
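The general idea of a greedy forward search can be sketched in a much simpler setting using base R only. The code below is not MoEClust's implementation: it is a toy analogue in which the only available action is adding one covariate to a linear model, and the helper name forward_search, the mtcars data, and the candidate covariates are all assumptions made for illustration. It uses stats::BIC, for which smaller values are better.

```r
# Toy greedy forward search (illustration only, not MoEClust's algorithm).
# At each step, every single-covariate addition is scored and the best one
# is accepted only if it improves on the current model; otherwise stop.
forward_search <- function(response, candidates, data) {
  current <- character(0)
  best    <- BIC(lm(reformulate("1", response), data = data))  # null model
  repeat {
    remaining <- setdiff(candidates, current)
    if (length(remaining) == 0L) break
    # Score each candidate action (stats::BIC: smaller is better)
    scores <- vapply(remaining, function(v) {
      BIC(lm(reformulate(c(current, v), response), data = data))
    }, numeric(1))
    if (min(scores) >= best) break     # no action improves the criterion
    current <- c(current, remaining[which.min(scores)])
    best    <- min(scores)
  }
  list(selected = current, bic = best)
}

res <- forward_search("mpg", c("wt", "hp", "qsec"), mtcars)
res$selected
```

Because each step commits to the single best action, such a search can settle on a model that is not globally optimal, which is why arguments like initialModel and initialG exist to give it a better starting point.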
Murphy, K. and Murphy, T. B. (2020). Gaussian parsimonious clustering models with covariates and a noise component. Advances in Data Analysis and Classification, 14(2): 293-325. <doi:10.1007/s11634-019-00373-8>.
MoE_clust, MoE_compare, MoE_control
# data(CO2data)
# Search over all models where the single covariate can enter either network
# (mod1 <- MoE_stepwise(CO2data$CO2, CO2data[,"GNP", drop=FALSE]))
#
# data(ais)
# Only look for EVE & EEE models with at most one expert network covariate
# Do not consider any gating covariates and only consider models with equal mixing proportions
# (mod2 <- MoE_stepwise(ais[,3:7], ais, gating=NA, expert="sex",
# equalPro="yes", modelNames=c("EVE", "EEE")))
#
# Look for models with noise & only those where the noise component's mixing proportion is constant
# Speed up the search with an initialModel, fix G, and restrict the covariates & model type
# init <- MoE_clust(ais[,3:7], G=2, modelNames="EEE",
# expert= ~ sex, network.data=ais, tau0=0.1)
# (mod3 <- MoE_stepwise(ais[,3:7], ais, noise=TRUE, expert="sex",
# gating=c("SSF", "Ht"), noise.gate="no",
# initialModel=init, stepG=FALSE, modelNames="EEE"))
#
# Compare both sets of results (with & without a noise component) for the ais data
# (comp1 <- MoE_compare(mod2, mod3, optimal.only=TRUE))
# comp1$optimal
#
# Target a model for the AIS data which is optimal in terms of ICL, without any restrictions
# mod4 <- MoE_stepwise(ais[,3:7], ais, criterion="icl")
#
# This gets stuck at a G=1 model, so specify an initial G value as a head start
# mod5 <- MoE_stepwise(ais[,3:7], ais, criterion="icl", initialG=2)
#
# Check that specifying an initial G value enables a better model to be found
# (comp2 <- MoE_compare(mod4, mod5, optimal.only=TRUE, criterion="icl"))
# Finally, restrict the search to full MoE models only
# Notice that the candidate covariates are the union of gating and expert
# Notice also that the algorithm initially traverses models with only
# expert covariates when the inclusion of gating covariates is infeasible
# mod6 <- MoE_stepwise(ais[,3:7], ais, fullMoE=TRUE, gating="BMI", expert="Bfat")