glmcwm: Fit for the Generalized Linear Mixed CWM

Description

Maximum likelihood fitting of the Generalized Linear Mixed Cluster-Weighted Model via the EM algorithm.

Usage

glmcwm (Y, Xcont=NULL, Xcate=NULL, m=NULL, familyY="Gaussian", k=2, 
  mY=1, method="Nelder-Mead", initialization="random.soft", start.z=NULL, 
  iter.max=1000, threshold=1.0e-04,loglikplot, seed=NULL)

Arguments

Y

numerical vector for the response variable.

Xcont

matrix for the continuous covariates.

Xcate

matrix for the categorical covariates.

m

number of levels for each categorical variable in Xcate (levels coded starting from 1).

familyY

the exponential family distribution used for Y|x in each cluster; it can be:

"Gaussian"
"Poisson"
"Binomial"
"Gamma"

Default value is "Gaussian".

k

a vector containing the numbers of clusters to be tried. The model with the lowest information criterion is selected. Default value is 2.

mY

when familyY="Binomial", it sets the binomial sample size (number of trials). Default value is 1 (Bernoulli distribution).

method

optimization method used in the M-step of the EM algorithm (see optim). Default value is "Nelder-Mead".

initialization

initialization strategy for the EM algorithm. It can be:

"random.soft"
"random.hard"
"manual"

Default value is "random.soft".

start.z

matrix of soft or hard classifications; it is used only if initialization="manual".

iter.max

maximum number of iterations in the EM algorithm. Default value is 1000.

threshold

convergence threshold for the Aitken acceleration procedure. Default value is 1.0e-04.

loglikplot

if TRUE, the log-likelihood values are plotted against the iteration number. Default value is FALSE.

seed

the seed for the random number generator, when random initializations are used; if NULL, current seed is not changed. Default value is NULL.
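
The iter.max and threshold arguments above control when the EM iterations stop. As a rough illustration only, the base R sketch below shows one common form of an Aitken-based stopping rule applied to a trace of log-likelihood values. The exact variant used inside glmcwm is not stated on this page, so the rule itself and the helper names aitken_limit and aitken_converged are assumptions made for this example.

```r
# Aitken-accelerated estimate of the limiting log-likelihood, computed from
# the last three values of a log-likelihood trace (hypothetical helper).
aitken_limit <- function(loglik) {
  n <- length(loglik)
  stopifnot(n >= 3)
  l_prev <- loglik[n - 2]
  l_cur  <- loglik[n - 1]
  l_next <- loglik[n]
  # No change between the two earlier iterations: nothing to extrapolate.
  if (l_cur == l_prev) return(l_next)
  a <- (l_next - l_cur) / (l_cur - l_prev)  # Aitken acceleration factor
  l_cur + (l_next - l_cur) / (1 - a)        # estimated asymptotic value
}

# Assumed stopping rule: stop when the estimated limit is within
# `threshold` of the current log-likelihood value.
aitken_converged <- function(loglik, threshold = 1.0e-04) {
  if (length(loglik) < 3) return(FALSE)
  abs(aitken_limit(loglik) - loglik[length(loglik)]) < threshold
}

# Toy trace (not produced by glmcwm) that converges geometrically to -100.
trace <- -100 - 5 * 0.5^(0:20)

aitken_limit(trace[1:3])       # extrapolated limit from the first 3 values
aitken_converged(trace[1:16])  # gap to the limit is still above 1e-04
aitken_converged(trace[1:17])  # gap to the limit has dropped below 1e-04
```

In an EM loop, a check of this kind would typically be evaluated after each iteration, with iter.max acting as an upper bound when the rule is never satisfied.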

Value

This function returns a list of values related to the selected model. It contains:

Y             response variable
Xcont, Xcate  covariates
familyY       exponential family distribution used for Y|x in each cluster
p             number of covariates
k             number of groups
n             sample size
npar          number of parameters
mY            sample size, used when familyY="Binomial"
prior         weights for the mixture components
muX           covariates means
VarX          covariates variances
PX            marginal distribution of X for each cluster
beta          regression coefficients
muY           mean of Y
dispY         dispersion parameter of Y
VarFunY       variance function of Y
VarY          variance of Y
nuY           when familyY="Gamma", the gamma distribution is parameterized according to muY and nuY (see McCullagh and Nelder, 1989)
PY            conditional distribution of Y|x for each cluster
iter.stop     number of iterations performed in the EM algorithm
z             matrix of posterior probabilities
group         classification vector
loglik        final log-likelihood value
AIC, AICc, AICu, AIC3, AWE, BIC, CAIC, ICL  information criteria
call          an object of class call
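
To make the relationship between prior, PX, PY, z, group and loglik more concrete, here is a small base R sketch of how such quantities are usually linked in a mixture model of this kind. It assumes that PX and PY are stored as n-by-k matrices of density values (one row per observation, one column per cluster); the page does not confirm that layout, and all numbers are made up for illustration.

```r
# Toy inputs for n = 4 observations and k = 2 clusters (made-up numbers).
prior <- c(0.6, 0.4)                                   # mixture weights
PX <- matrix(c(0.30, 0.05,
               0.25, 0.10,
               0.02, 0.40,
               0.10, 0.10), ncol = 2, byrow = TRUE)    # density of x_i in each cluster
PY <- matrix(c(0.50, 0.10,
               0.40, 0.20,
               0.05, 0.60,
               0.20, 0.45), ncol = 2, byrow = TRUE)    # density of y_i given x_i in each cluster

# Joint contribution of each cluster: prior_g * p(x_i | g) * p(y_i | x_i, g).
joint <- sweep(PX * PY, 2, prior, `*`)

# Posterior probabilities: normalise each row so that it sums to one.
z <- joint / rowSums(joint)

# Hard classification: the cluster with the largest posterior probability.
group <- apply(z, 1, which.max)

# Observed-data log-likelihood of the mixture.
loglik <- sum(log(rowSums(joint)))

print(round(z, 3))
print(group)
```

Under these assumptions, group is simply the row-wise maximum a posteriori assignment derived from z.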

References

Ingrassia, S., Minotti, S. C., and Vittadini, G. (2012). Local statistical modeling via the cluster-weighted approach with elliptical distributions. Journal of Classification, 29(3), 363-401.

Ingrassia, S., Minotti, S. C., Punzo, A., and Vittadini, G. (2012). Generalized linear Gaussian cluster-weighted modeling. arXiv.org e-print 1211.1171, available at: http://arxiv.org/abs/1211.1171.

McCullagh, P. and Nelder, J. (1989). Generalized Linear Models. Chapman & Hall, Boca Raton, 2nd edition.

Examples

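# Load the tourism data and extract the response and the continuous covariate
data(tourism)
Y <- tourism$overnights
X <- tourism$attendance

# Fit models with 1 to 4 clusters; the seed makes the random starts reproducible
res <- glmcwm(Y = Y, Xcont = X, k = 1:4, seed = 1)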
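
The call above passes k=1:4, so several models are fitted and the one with the lowest information criterion is kept. The base R sketch below illustrates that kind of selection using the textbook definitions of AIC and BIC. Which criterion glmcwm uses for the selection is not stated on this page, and the log-likelihood values are invented for illustration; they are not results from the tourism data.

```r
# Hypothetical summary of four fits to the same n = 100 observations.
# npar follows an illustrative count for one continuous covariate:
# 5 parameters per cluster plus (k - 1) free mixture weights.
fits <- data.frame(
  k      = 1:4,
  loglik = c(-520.0, -455.0, -450.0, -447.5),  # made-up values
  npar   = c(5, 11, 17, 23)
)
n <- 100

# Textbook definitions (lower is better under this sign convention).
fits$AIC <- -2 * fits$loglik + 2 * fits$npar
fits$BIC <- -2 * fits$loglik + fits$npar * log(n)

# Pick the number of clusters that minimises the chosen criterion.
best_k <- fits$k[which.min(fits$BIC)]

print(fits)
print(best_k)
```

With a fitted object, the corresponding quantities would come from the loglik, npar and n elements listed under Value.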