This function selects the explanatory variables and the number of mixture components, and estimates the parameters, of a conditional Gaussian mixture model using a stepwise algorithm. In the first iteration, the SMEM algorithm updates the number of components and the parameters of the initial model. Each subsequent iteration adds or removes one candidate explanatory variable and then re-estimates the model with the SMEM algorithm. The add or remove operation selected is the one that maximizes a conditional scoring function after re-estimation. The stepwise algorithm stops when none of the candidate operations improves the score.
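The greedy add/remove loop described above can be illustrated with a minimal, self-contained sketch. It is not the package's implementation: as a stand-in for SMEM re-estimation of a Gaussian mixture, it fits a linear model with `lm` and uses the negated BIC as the score, so that larger is better. The names `score_model`, `greedy_stepwise`, `x_init`, `scores` and `operations` are hypothetical and chosen only for this illustration.

```r
# Stand-in for "re-estimate, then score": fit a linear model and return -BIC
# so that a larger value is better. (Assumption: the documented function
# re-estimates a conditional Gaussian mixture with SMEM instead.)
score_model <- function(data, y, x) {
  rhs <- if (length(x) == 0) "1" else paste(x, collapse = " + ")
  fit <- lm(as.formula(paste(y, "~", rhs)), data = data)
  -BIC(fit)
}

greedy_stepwise <- function(data, y, x_cand, x_init = character(0),
                            max_iter = 10) {
  x <- x_init
  best <- score_model(data, y, x)
  scores <- best                 # score measured initially, then per iteration
  operations <- character(0)     # operation performed at each iteration
  for (iter in seq_len(max_iter)) {
    # Candidate operations: add an unused candidate or remove a used one.
    # Variables that are not candidates can never be removed.
    adds <- setdiff(x_cand, x)
    removes <- intersect(x, x_cand)
    ops <- c(
      lapply(adds, function(v) list(label = paste("add", v), x = c(x, v))),
      lapply(removes, function(v) list(label = paste("remove", v),
                                       x = setdiff(x, v)))
    )
    if (length(ops) == 0) break
    op_scores <- vapply(ops, function(op) score_model(data, y, op$x),
                        numeric(1))
    i <- which.max(op_scores)
    if (op_scores[i] <= best) break   # no operation improves the score
    x <- ops[[i]]$x
    best <- op_scores[i]
    scores <- c(scores, best)
    operations <- c(operations, ops[[i]]$label)
  }
  list(x = x, scores = scores, operations = operations)
}

# Simulated data: y depends strongly on x1 and x2, and not on x3.
set.seed(1)
n <- 200
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
toy$y <- 2 * toy$x1 - 3 * toy$x2 + rnorm(n)

res <- greedy_stepwise(toy, "y", c("x1", "x2", "x3"))
print(res$x)
print(res$operations)
```

Because an operation is accepted only when it strictly improves the best score so far, the recorded score sequence is increasing and the loop ends as soon as every candidate operation fails to improve it.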
stepwise(
  gmm,
  data,
  y = rownames(gmm$mu)[1],
  x_cand = setdiff(colnames(data), y),
  score = "bic",
  add = TRUE,
  remove = TRUE,
  min_x = 0,
  max_x = Inf,
  max_iter_step = 10,
  verbose = FALSE,
  ...
)
A list with elements:
The final gmm object.
A numeric matrix containing the posterior probabilities for each observation.
A numeric vector containing the sequence of scores measured initially and after each iteration.
A character vector containing the sequence of add and remove operations performed at each iteration.
gmm: An initial object of class gmm.
data: A data frame or numeric matrix containing the data used in the stepwise algorithm. Its columns must be explicitly named after the variables of gmm and the candidate explanatory variables, and must not contain missing values.
y: A character vector containing the dependent variables (by default the first variable of gmm).
x_cand: A character vector containing the candidate explanatory variables for addition or removal (by default all the column names of data except y). Variables already in gmm that are not candidates cannot be removed.
score: A character string ("aic", "bic" or "loglik") corresponding to the scoring function.
add: A logical value indicating whether add operations are allowed (if FALSE, no variable can be added).
remove: A logical value indicating whether remove operations are allowed (if FALSE, no variable can be removed).
min_x: A non-negative integer corresponding to the minimum number of explanatory variables.
max_x: A non-negative integer corresponding to the maximum number of explanatory variables.
max_iter_step: A non-negative integer corresponding to the maximum number of iterations.
verbose: A logical value indicating whether iterations in progress are displayed.
...: Additional arguments passed to function smem.
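The way the arguments above restrict the candidate operations can be sketched as follows. This is an illustration under stated assumptions rather than the package's code: it assumes that an addition is allowed only while the number of explanatory variables is below the maximum, and a removal only while it is above the minimum. The function `candidate_ops` and its argument `x_current` are hypothetical names.

```r
# Enumerate the add/remove operations permitted from the current set of
# explanatory variables. (Assumed interpretation of add, remove, min_x and
# max_x; the package may apply these limits differently.)
candidate_ops <- function(x_current, x_cand, add = TRUE, remove = TRUE,
                          min_x = 0, max_x = Inf) {
  ops <- character(0)
  if (add && length(x_current) < max_x) {
    # sprintf() returns character(0) when there is nothing to add
    ops <- c(ops, sprintf("add %s", setdiff(x_cand, x_current)))
  }
  if (remove && length(x_current) > min_x) {
    # Only variables that are themselves candidates can be removed
    ops <- c(ops, sprintf("remove %s", intersect(x_current, x_cand)))
  }
  ops
}

# "a" is already in the model; "b" and "c" are not
print(candidate_ops("a", c("a", "b", "c")))
# "a" is not a candidate, so it cannot be removed
print(candidate_ops("a", "b"))
# With max_x = 2 reached, only removals remain
print(candidate_ops(c("a", "b"), c("a", "b", "c"), max_x = 2))
```

When the returned vector is empty, no operation is available and a stepwise search built on this helper would stop.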
em, smem
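library(gmgm)  # assumed: the package that provides stepwise, add_var and data_body
data(data_body)
# Start from a model that contains only the dependent variable WAIST
gmm_1 <- add_var(NULL, "WAIST")
# max_comp is not an argument of stepwise; it is passed on to smem through ...
res_step <- stepwise(gmm_1, data_body, verbose = TRUE, max_comp = 3)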