EMGLLF: EMGLLF

Description

Run a generalized EM algorithm developed for mixtures of Gaussian regression models, with variable selection by an extension of the Lasso estimator (regularization parameter lambda). The model is reparametrized to ensure invariance under homothetic transformation. It returns a collection of models, varying the number of clusters and the sparsity in the regression mean.

Usage

EMGLLF(
  phiInit,
  rhoInit,
  piInit,
  gamInit,
  mini,
  maxi,
  gamma,
  lambda,
  X,
  Y,
  eps,
  fast
)

Arguments

phiInit

an initialization for phi

rhoInit

an initialization for rho

piInit

an initialization for pi

gamInit

an initialization for the a posteriori probabilities

mini

integer, minimum number of iterations in the EM algorithm, by default = 10

maxi

integer, maximum number of iterations in the EM algorithm, by default = 100

gamma

integer, power used in the penalty, by default = 1

lambda

regularization parameter in the Lasso estimation

X

matrix of covariates (of size n*p)

Y

matrix of responses (of size n*m)

eps

real, threshold to say the EM algorithm converges, by default = 1e-4

fast

boolean, whether to call the C function

Value

A list (corresponding to the model collection) defined by (phi, rho, pi, llh, S, affec):

phi : regression mean for each cluster, an array of size p*m*k

rho : variance (homothetic) for each cluster, an array of size m*m*k

pi : proportion for each cluster, a vector of size k

llh : log likelihood with respect to the training set

S : indices of the selected variables, an array of size p*m*k

affec : cluster assignment for each observation (of the training set)
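
Examples

The following is a minimal sketch of a call, not an official example. It assumes that the function is exported by a package named valse, that phiInit, rhoInit and piInit have the same shapes as the returned phi, rho and pi, and that gamInit is an n*k matrix of a posteriori probabilities. The toy data, the sizes (n, p, m, k), the value of lambda and the initial values are all hypothetical; in practice the initial values would come from a proper initialization procedure. The call itself is only attempted when the package is installed.

```r
# Hypothetical toy problem: sizes and values are illustrative only.
set.seed(1)
n <- 60  # observations
p <- 4   # covariates
m <- 2   # responses
k <- 2   # clusters

X <- matrix(rnorm(n * p), nrow = n, ncol = p)  # covariates, n*p
Y <- matrix(rnorm(n * m), nrow = n, ncol = m)  # responses, n*m

# Initial values. Shapes are ASSUMED to mirror the returned phi, rho, pi.
phiInit <- array(0, dim = c(p, m, k))          # p*m*k, all zeros
rhoInit <- array(diag(m), dim = c(m, m, k))    # one identity matrix per cluster
piInit  <- rep(1 / k, k)                       # equal proportions

# ASSUMED n*k matrix of a posteriori probabilities, built here from a
# random hard assignment of each observation to one cluster.
z <- sample(k, n, replace = TRUE)
gamInit <- matrix(0, nrow = n, ncol = k)
gamInit[cbind(seq_len(n), z)] <- 1

# The package name "valse" is an assumption; skip the call if it is absent.
if (requireNamespace("valse", quietly = TRUE)) {
  fit <- valse::EMGLLF(phiInit, rhoInit, piInit, gamInit,
                       mini = 10, maxi = 100, gamma = 1, lambda = 0.1,
                       X = X, Y = Y, eps = 1e-4, fast = FALSE)
  str(fit)  # expected components per the Value section: phi, rho, pi, llh, S, affec
}
```

The numeric arguments repeat the defaults stated in the Arguments section. Since Usage lists no default values, every argument is passed explicitly.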