flexCWM (version 1.1)

glmcwm: Fit for the Generalized Linear Mixed CWM

Description

Maximum likelihood fitting of the Generalized Linear Mixed Cluster-Weighted Model by using the EM algorithm.

Usage

glmcwm(Y, Xcont=NULL, Xcate=NULL, m=NULL, familyY="Gaussian", k=2,
  mY=1, method="Nelder-Mead", initialization="random.soft", start.z=NULL,
  iter.max=1000, threshold=1.0e-04, loglikplot=FALSE, seed=NULL)

Arguments

Y
numerical vector for the response variable.
Xcont
matrix for the continuous covariates.
Xcate
matrix for the categorical covariates.
m
vector giving the number of levels of each categorical variable in Xcate (with levels coded starting from 1).
familyY
the exponential family distribution used for Y|x in each cluster; it can be one of:
  • "Gaussian"
  • "Poisson"
  • "Binomial"
  • "Gamma"
Default value is "Gaussian".
k
a vector containing the numbers of clusters to be tried. The one with the lowest information criterion is selected. Default value is 2.
mY
number of trials of the binomial distribution; used only when familyY="Binomial". Default value is 1 (Bernoulli distribution).
method
optimization method used in the M-step of the EM algorithm (see optim). Default value is "Nelder-Mead".
initialization
initialization strategy for the EM-algorithm. It can be:
  • "random.soft"
  • "random.hard"
  • "manual"
Default value is "random.soft".
start.z
matrix of soft or hard classification: it is used only if initialization="manual".
iter.max
maximum number of iterations in the EM algorithm. Default value is 1000.
threshold
convergence threshold for the Aitken acceleration procedure. Default value is 1.0e-04.
loglikplot
if TRUE, the log-likelihood values are plotted against the iterations. Default value is FALSE.
seed
the seed for the random number generator, when random initializations are used; if NULL, current seed is not changed. Default value is NULL.
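
The threshold argument refers to an Aitken-based stopping rule. The exact variant implemented in glmcwm is not documented on this page, so the following is only a minimal base R sketch of one common form of the rule, applied to a vector of log-likelihood values. The function name aitken_converged and its arguments are hypothetical and are not part of flexCWM.

```r
# Hypothetical helper (not part of flexCWM): one common form of the
# Aitken-based stopping rule, applied to the last three log-likelihood values.
aitken_converged <- function(loglik, threshold = 1e-4) {
  n <- length(loglik)
  if (n < 3) return(FALSE)                 # three consecutive values are needed
  l_prev <- loglik[n - 2]
  l_curr <- loglik[n - 1]
  l_next <- loglik[n]
  denom <- l_curr - l_prev
  if (denom == 0) return(TRUE)             # log-likelihood has stopped changing
  a <- (l_next - l_curr) / denom           # Aitken acceleration factor
  if (a >= 1) return(FALSE)                # no finite asymptotic estimate
  l_inf <- l_curr + (l_next - l_curr) / (1 - a)  # asymptotic log-likelihood estimate
  abs(l_inf - l_next) < threshold          # stop when the estimate is close enough
}

# Toy sequence converging geometrically to -100
loglik_path <- -100 - 10 * 0.5^(1:30)
aitken_converged(loglik_path[1:3])   # early iterations: not yet converged
aitken_converged(loglik_path)        # late iterations: converged
```

A smaller threshold makes the rule stricter, so the EM algorithm would typically run for more iterations before stopping, up to iter.max.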

Value

This function returns a list of values related to the model selected. It contains:

  • Y: response variable
  • Xcont, Xcate: covariates
  • familyY: exponential family distribution used for Y|x in each cluster
  • p: number of covariates
  • k: number of groups
  • n: sample size
  • npar: number of parameters
  • mY: number of binomial trials, used when familyY="Binomial"
  • prior: weights for the mixture components
  • muX: covariates means
  • VarX: covariates variances
  • PX: marginal distribution of X for each cluster
  • beta: regression coefficients
  • muY: mean of Y
  • dispY: dispersion parameter of Y
  • VarFunY: variance function of Y
  • VarY: variance of Y
  • nuY: when familyY="Gamma", the gamma distribution is parameterized according to muY and nuY (see McCullagh, P. and Nelder, J. 1989)
  • PY: conditional distribution of Y|x for each cluster
  • iter.stop: number of iterations performed in the EM algorithm
  • z: matrix of posterior probabilities
  • group: classification vector
  • loglik: final log-likelihood value
  • AIC, AICc, AICu, AIC3, AWE, BIC, CAIC, ICL: information criteria
  • call: an object of class call
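
To make the relation between z and group concrete, the sketch below assigns each observation to the cluster with the highest posterior probability (maximum a posteriori rule). It is an assumption here that glmcwm derives group from z in this way; the matrix below is made-up toy data, not output of the function.

```r
# Toy posterior matrix: one row per observation, one column per cluster.
# These numbers are illustrative only.
z <- matrix(c(0.90, 0.10,
              0.20, 0.80,
              0.55, 0.45),
            ncol = 2, byrow = TRUE)

# Maximum a posteriori classification: index of the largest entry in each row
group <- apply(z, 1, which.max)
group

# Cluster sizes implied by the hard classification
table(group)
```

Rows whose largest posterior probability is close to 1/k indicate observations that are not clearly assigned to any single cluster.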

References

Ingrassia, S., Minotti, S. C., and Vittadini, G. (2012). Local statistical modeling via the cluster-weighted approach with elliptical distributions. Journal of Classification, 29(3), 363-401.

Ingrassia, S., Minotti, S. C., Punzo, A., and Vittadini, G. (2012). Generalized linear Gaussian cluster-weighted modeling. arXiv.org e-print 1211.1171, available at: http://arxiv.org/abs/1211.1171.

McCullagh, P. and Nelder, J. (1989). Generalized Linear Models. Chapman & Hall, Boca Raton, 2nd edition.

See Also

flexCWM-package, tourism

Examples

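library(flexCWM)
data(tourism)
Y <- tourism$overnights   # response variable
X <- tourism$attendance   # continuous covariate
# Fit Gaussian CWMs with 1 to 4 clusters; the model with the lowest
# information criterion is returned
res <- glmcwm(Y = Y, Xcont = X, k = 1:4, seed = 1)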
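
As a further illustration, the sketch below builds a hard classification matrix that could be passed as start.z when initialization="manual". The expected layout of start.z (one row per observation, one column per cluster, a single 1 in each row) is an assumption, since this page only describes it as a "matrix of soft or hard classification". The data are simulated, and the k-means starting partition is just one possible choice.

```r
# Simulated data with two groups (illustrative only)
set.seed(1)
n <- 100
x <- c(rnorm(n / 2, mean = 0), rnorm(n / 2, mean = 5))
y <- ifelse(seq_len(n) <= n / 2, 1 + 2 * x, 10 - x) + rnorm(n, sd = 0.5)

# Starting partition from k-means (stats package, attached by default)
k <- 2
cl <- kmeans(cbind(x, y), centers = k, nstart = 5)$cluster

# Hard classification matrix: entry [i, j] is 1 if observation i starts in cluster j
z_hard <- matrix(0, nrow = n, ncol = k)
z_hard[cbind(seq_len(n), cl)] <- 1

# Assumed usage, not run here (requires flexCWM and the layout assumed above):
# res <- glmcwm(Y = y, Xcont = x, k = k,
#               initialization = "manual", start.z = z_hard)
```

A soft starting matrix would have the same shape, with each row holding probabilities that sum to 1 instead of a single 1.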