Solving the generalized estimating equations for correlated nominal multinomial responses assuming a baseline category logit model for the marginal probabilities.
nomLORgee(formula, data, id = id, repeated = NULL,
bstart = NULL, LORstr = "time.exch", LORem = "3way", LORterm = NULL,
add = 0, homogeneous = TRUE, control = LORgee.control(),
ipfp.ctrl = ipfp.control(), IM = "solve")
formula: a formula expression as for other regression models for multinomial responses. An intercept term must be included.
data: an optional data frame containing the variables provided in formula, id and repeated.
id: a vector that identifies the clusters.
repeated: an optional vector that identifies the order of the observations within each cluster.
bstart: a vector that includes an initial estimate for the marginal regression parameter vector.
LORstr: a character string that indicates the marginalized local odds ratios structure. Options include "independence", "time.exch", "RC" or "fixed".
LORem: a character string that indicates if the marginalized local odds ratios structure is estimated simultaneously ("3way") or independently at each level pair of repeated ("2way").
LORterm: a matrix that satisfies the user-defined local odds ratios structure. It is ignored unless LORstr="fixed".
add: a positive constant to be added at each cell of the full marginalized contingency table in the presence of zero observed counts.
homogeneous: a logical that indicates homogeneous score parameters when LORstr="time.exch" or "RC".
control: a vector that specifies the control variables for the GEE solver.
ipfp.ctrl: a vector that specifies the control variables for the function ipfp.
IM: a character string that indicates the method used for inverting a matrix. Options include "solve", "qr.solve" or "cholesky".
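The three IM options correspond to standard ways of inverting a matrix in base R. The sketch below uses a small made-up symmetric positive definite matrix and assumes that "cholesky" refers to inversion through a Cholesky factor. It only shows that the three approaches agree numerically; how nomLORgee applies them internally is not documented here.

```r
# Hypothetical symmetric positive definite matrix, for illustration only.
A <- matrix(c(4, 1, 0,
              1, 3, 1,
              0, 1, 2), nrow = 3, byrow = TRUE)

inv_solve <- solve(A)            # general-purpose inversion
inv_qr    <- qr.solve(A)         # inversion through a QR decomposition
inv_chol  <- chol2inv(chol(A))   # inversion through the Cholesky factor

all.equal(inv_solve, inv_qr)
all.equal(inv_solve, inv_chol)
```

The Cholesky route requires a symmetric positive definite matrix, so it is the most restrictive of the three.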
Returns an object of the class "LORgee". This has components:
the matched call.
title for the GEE model.
the current version of the GEE solver.
the marginal link function.
the marginalized local odds ratios structure variables.
the terms structure describing the marginal model.
the contrasts used for the factors.
the number of observations.
the values of the convergence variables.
the estimated regression parameter vector of the marginal model.
the estimated linear predictor of the marginal regression model. The \(j\)-th column corresponds to the \(j\)-th response category.
the estimated fitted values of the marginal regression model. The \(j\)-th column corresponds to the \(j\)-th response category.
the residuals of the marginal regression model based on the binary responses. The \(j\)-th column corresponds to the \(j\)-th response category.
the multinomial response variables.
the id variable.
the number of clusters.
the number of observations within each cluster.
the estimated "robust" covariance matrix.
the estimated "naive" or "model-based" covariance matrix.
the regression coefficients' symbolic names.
the number of observed response categories.
the levels of the repeated variable.
the control values for the GEE solver.
the control values for the function ipfp.
the method used for inverting matrices.
the value used for add.
the p-value based on a Wald test that no covariates are statistically significant.
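As a rough illustration of the kind of Wald test behind the reported p-value, the sketch below tests that a set of covariate effects is jointly zero. The coefficient values, the covariance matrix and the choice of which entries count as covariate effects are all made up, and the exact statistic computed inside the package may differ.

```r
# Hypothetical estimates: two intercepts followed by two covariate effects.
beta <- c(b01 = 0.50, b02 = -0.20, b1 = 0.30, b2 = -0.10)
# Hypothetical robust covariance matrix (diagonal for simplicity).
V <- diag(c(0.04, 0.05, 0.01, 0.02))

idx <- 3:4                                  # positions of the covariate effects
b   <- beta[idx]
W   <- as.numeric(t(b) %*% solve(V[idx, idx]) %*% b)  # Wald statistic
pvalue <- pchisq(W, df = length(idx), lower.tail = FALSE)
pvalue
```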
The data must be provided in case level or, equivalently, in "long" format. See details about the "long" format in the function reshape.
A term of the form offset(expression) is allowed in the right hand side of formula.
The default set for the response categories is \(\{1,\ldots,J\}\), where \(J>2\) is the maximum observed response category. Otherwise, the function recodes the observed response categories onto this set.
The \(J\)-th response category is treated as baseline.
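The recoding of the response categories can be mimicked in base R. The sketch below maps arbitrary labels onto \(\{1,\ldots,J\}\). It assumes that the mapping follows the sorted order of the observed labels, which this page does not state, so check the recoded values whenever the choice of baseline category matters. The labels themselves are hypothetical.

```r
# Hypothetical nominal responses recorded with arbitrary labels.
y_raw <- c("rent", "own", "other", "own", "rent")

labels <- sort(unique(y_raw))   # observed categories in sorted order
y <- match(y_raw, labels)       # integer codes in 1, ..., J
J <- length(labels)             # under this coding, category J is the baseline

data.frame(y_raw, y)
```

Recoding the response yourself before fitting makes the baseline category explicit rather than implied.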
The default set for the id labels is \(\{1,\ldots,N\}\), where \(N\) is the sample size. Otherwise, the function recodes the given labels onto this set.
The argument repeated can be ignored only when data is written in such a way that the \(t\)-th observation in each cluster is recorded at the \(t\)-th measurement occasion. If this is not the case, then the user must provide repeated. The suggested set for the levels of repeated is \(\{1,\ldots,T\}\), where \(T\) is the number of observed levels. Otherwise, the function recodes the given levels onto this set.
The variables id and repeated do not need to be pre-sorted. Instead, the function reorders data in ascending order of id and repeated.
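If the data are stored with one row per cluster (wide format), the base R function reshape can produce the long format described above. The sketch below is only an illustration; the data frame and its column names (x, y.1, y.2, y.3) are hypothetical.

```r
# Hypothetical wide data: one row per cluster, three measurement occasions.
wide <- data.frame(id  = 1:3,
                   x   = c(0, 1, 0),
                   y.1 = c(1, 2, 3),
                   y.2 = c(2, 2, 1),
                   y.3 = c(3, 1, 1))

long <- reshape(wide, direction = "long",
                varying = c("y.1", "y.2", "y.3"),  # repeated responses
                v.names = "y", timevar = "time", times = 1:3,
                idvar = "id")

# Optional: order by cluster and occasion for readability.
long <- long[order(long$id, long$time), ]
rownames(long) <- NULL
long
```

The resulting columns id and time could then be passed as the id and repeated arguments.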
The fitted marginal baseline category logit model is $$\log \frac{\Pr(Y_{it}=j \mid x_{it})}{\Pr(Y_{it}=J \mid x_{it})}=\beta_{0j} +\beta_{j}^{\prime} x_{it}, \qquad j=1,\ldots,J-1,$$ where \(Y_{it}\) is the \(t\)-th multinomial response for cluster \(i\), \(x_{it}\) is the associated covariates vector, \(\beta_{0j}\) is the \(j\)-th response category specific intercept and \(\beta_{j}\) is the \(j\)-th response category specific parameter vector.
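Inverting the logits gives the marginal probabilities: each non-baseline category has probability proportional to the exponential of its linear predictor, and the baseline category \(J\) has weight one. The sketch below illustrates this for \(J=3\) and a single covariate; all parameter values are made up.

```r
# Hypothetical parameters for J = 3 categories and one covariate.
beta0 <- c(0.5, -0.2)   # intercepts for categories 1 and 2
beta  <- c(0.3, -0.1)   # covariate effects for categories 1 and 2
x     <- 1.5            # covariate value for one observation

eta  <- beta0 + beta * x                      # logits against category J
prob <- c(exp(eta), 1) / (1 + sum(exp(eta)))  # probabilities for 1, ..., J

sum(prob)                  # the probabilities sum to one
log(prob[1:2] / prob[3])   # recovers the logits in eta
```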
The LORterm argument must be an \(L \times J^2\) matrix, where \(L\) is the number of level pairs of repeated. These are ordered as \((1,2), (1,3), \ldots, (1,T), (2,3), \ldots, (T-1,T)\) and the rows of LORterm must preserve this order. Each row is assumed to contain the vectorized form of a probability table that satisfies the desired local odds ratios structure.
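A matrix of this shape can be assembled in base R. The sketch below assumes three levels of repeated and three response categories, and fills every row with the same \(J \times J\) probability table whose local odds ratios all equal a made-up value theta. The table is symmetric, so row-wise and column-wise vectorization coincide; for a non-symmetric table the vectorization order would need to be checked against the package documentation.

```r
n_times <- 3                    # hypothetical number of levels of repeated (T)
J <- 3                          # hypothetical number of response categories
pairs <- t(combn(n_times, 2))   # level pairs in the order (1,2), (1,3), (2,3)
L <- nrow(pairs)                # T(T-1)/2 level pairs

theta <- 2                      # hypothetical common local odds ratio
tab <- outer(1:J, 1:J, function(a, b) theta^(a * b))
tab <- tab / sum(tab)           # J x J probability table

# One vectorized table per level pair, in the order given by pairs.
LORterm <- matrix(rep(as.vector(tab), times = L), nrow = L, byrow = TRUE)
dim(LORterm)                    # L rows and J^2 columns
```

Such a matrix would be supplied together with LORstr = "fixed", since LORterm is ignored otherwise.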
Touloumis, A. (2011). GEE for multinomial responses. PhD dissertation, University of Florida.
Touloumis, A., Agresti, A. and Kateri, M. (2013). GEE for multinomial responses using a local odds ratios parameterization. Biometrics, 69, 633-640.
Touloumis, A. (2015). R Package multgee: A Generalized Estimating Equations Solver for Multinomial Responses. Journal of Statistical Software, 64, 1-14.
For an ordinal response scale use the function ordLORgee.
## See the interpretation in Touloumis (2011).
library(multgee)  # provides nomLORgee and the housing data
data(housing)
fitmod <- nomLORgee(y ~ factor(time) * sec, data = housing, id = id,
                    repeated = time)
summary(fitmod)