nomLORgee: Marginal Models For Correlated Nominal Multinomial Responses

Description

Solves the generalized estimating equations for correlated nominal multinomial responses, assuming a baseline category logit model for the marginal probabilities.

Usage

nomLORgee(formula, data, id = id, repeated = NULL, 
          bstart = NULL, LORstr = "time.exch", LORem = "3way", LORterm = NULL, 
          add = 0, homogeneous = TRUE, control = LORgee.control(), 
          ipfp.ctrl = ipfp.control(), IM = "solve")

Arguments

formula

a formula expression as for other regression models for multinomial responses. An intercept term must be included.

data

an optional data frame containing the variables provided in formula, id and repeated.

id

a vector that identifies the clusters.

repeated

an optional vector that identifies the order of the observations within each cluster.

bstart

a vector that includes an initial estimate for the marginal regression parameter vector.

LORstr

a character string that indicates the marginalized local odds ratios structure. Options include "independence", "time.exch", "RC" or "fixed".

LORem

a character string that indicates if the marginalized local odds ratios structure is estimated simultaneously ("3way") or independently at each level pair of repeated ("2way").

LORterm

a matrix that satisfies the user-defined local odds ratios structure. It is ignored unless LORstr="fixed".

add

a non-negative constant to be added at each cell of the full marginalized contingency table in the presence of zero observed counts.

homogeneous

a logical that indicates homogeneous score parameters when LORstr="time.exch" or "RC".

control

a vector that specifies the control variables for the GEE solver.

ipfp.ctrl

a vector that specifies the control variables for the function ipfp.

IM

a character string that indicates the method used for inverting a matrix. Options include "solve", "qr.solve" or "cholesky".
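The option names for IM match standard matrix inversion routes in base R. The sketch below compares the three on a small matrix; the matrix values are hypothetical, and reading "cholesky" as `chol2inv(chol(A))` is an assumption about the internals rather than something stated on this page.

```r
# Illustrative symmetric positive definite matrix (hypothetical values).
A <- matrix(c(4, 1, 1,
              1, 3, 0,
              1, 0, 2), nrow = 3, byrow = TRUE)

inv_solve    <- solve(A)            # inverse via LU decomposition
inv_qr       <- qr.solve(A)         # inverse via QR decomposition
inv_cholesky <- chol2inv(chol(A))   # inverse from the Cholesky factor (assumed meaning)

# For a well-conditioned matrix the three agree up to numerical error.
all.equal(inv_solve, inv_qr, check.attributes = FALSE)
all.equal(inv_solve, inv_cholesky, check.attributes = FALSE)
```

The choice should matter mainly when the matrix to be inverted is close to singular, where the routines can differ in how they fail or in how much rounding error they accumulate.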

Value

Returns an object of the class "LORgee". This has components:

call

the matched call.

title

title for the GEE model.

version

the current version of the GEE solver.

link

the marginal link function.

local.odds.ratios

the marginalized local odds ratios structure variables.

terms

the terms structure describing the marginal model.

contrasts

the contrasts used for the factors.

nobs

the number of observations.

convergence

the values of the convergence variables.

coefficients

the estimated regression parameter vector of the marginal model.

linear.pred

the estimated linear predictor of the marginal regression model. The $j$-th column corresponds to the $j$-th response category.

fitted.values

the estimated fitted values of the marginal regression model. The $j$-th column corresponds to the $j$-th response category.

residuals

the residuals of the marginal regression model based on the binary responses. The $j$-th column corresponds to the $j$-th response category.

y

the multinomial response variables.

id

the id variable.

max.id

the number of clusters.

clusz

the number of observations within each cluster.

robust.variance

the estimated "robust" covariance matrix.

naive.variance

the estimated "naive" or "model-based" covariance matrix.

xnames

the regression coefficients' symbolic names.

Details

The data must be provided in case-level or, equivalently, in 'long' format. See details about the 'long' format in the function reshape.
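As a minimal sketch of the 'long' format, the base R function `reshape` can stack a wide data frame so that each row holds one observation of one cluster. The data frame and its column names (`sec`, `y.1`, `y.2`, `y.3`) are hypothetical.

```r
# Hypothetical wide data: one row per cluster, one column per occasion.
wide <- data.frame(id  = 1:3,
                   sec = c(0, 1, 0),
                   y.1 = c(1, 3, 2),
                   y.2 = c(2, 3, 1),
                   y.3 = c(3, 1, 1))

# Stack the y.* columns so that each row is one observation of one cluster.
long <- reshape(wide,
                direction = "long",
                varying   = c("y.1", "y.2", "y.3"),
                v.names   = "y",
                timevar   = "time",
                times     = 1:3,
                idvar     = "id")

# Order by cluster and occasion for readability.
long <- long[order(long$id, long$time), ]
rownames(long) <- NULL
long
```

In the resulting data frame, `id` identifies the cluster, `time` identifies the measurement occasion and `y` holds the response.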

A term of the form offset(expression) is allowed on the right-hand side of formula.

The default set for the response categories is $\{1,\ldots,J\}$, where $J>2$ is the maximum observed response category. Otherwise, the function recodes the observed response categories onto this set.

The $J$-th response category is treated as baseline.
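To see in advance which category would end up as baseline, the recoding can be imitated in base R. This is only a sketch: it assumes the recoding follows the ascending order of the distinct observed values, which is not stated on this page, and the response values are hypothetical.

```r
# Hypothetical response values that are not coded as 1, ..., J.
y_raw <- c(0, 2, 5, 2, 0, 5)

# Map the sorted distinct values onto 1, ..., J (assumed recoding rule).
y_coded <- as.integer(factor(y_raw))
J <- max(y_coded)

# Cross-tabulate to see which original value received which code.
table(original = y_raw, code = y_coded)
```

Under this assumption the largest original value (here 5) receives code $J$ and therefore acts as the baseline category.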

The default set for the id labels is $\{1,\ldots,N\}$, where $N$ is the sample size. Otherwise, the function recodes the given labels onto this set.

The argument repeated can be ignored only when data is written in such a way that the $t$-th observation in each cluster is recorded at the $t$-th measurement occasion. If this is not the case, then the user must provide repeated. The suggested set for the levels of repeated is $\{1,\ldots,T\}$, where $T$ is the number of observed levels. Otherwise, the function recodes the given levels onto this set.

The variables id and repeated do not need to be pre-sorted. Instead, the function sorts data in ascending order of id and, within each cluster, of repeated.
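The ordering described above can be reproduced with `order` to inspect the data as the solver would see it. This is a sketch on a hypothetical data frame, not the function's internal code.

```r
# Hypothetical unsorted long-format data.
dat <- data.frame(id   = c(2, 1, 2, 1, 3, 3),
                  time = c(2, 1, 1, 2, 2, 1),
                  y    = c(1, 3, 2, 2, 1, 3))

# Ascending order of id first, then of time within id.
dat_sorted <- dat[order(dat$id, dat$time), ]
rownames(dat_sorted) <- NULL
dat_sorted
```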

The fitted marginal baseline category logit model is $$\log \frac{\Pr(Y_{it}=j \mid x_{it})}{\Pr(Y_{it}=J \mid x_{it})}=\beta_{0j} +\beta^{\prime}_j x_{it}, \quad j=1,\ldots,J-1,$$ where $Y_{it}$ is the $t$-th multinomial response for cluster $i$, $x_{it}$ is the associated covariates vector, $\beta_{0j}$ is the $j$-th response category specific intercept and $\beta_{j}$ is the $j$-th response category specific parameter vector.
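To make the model concrete, the logits can be inverted to obtain the marginal probabilities of all $J$ categories. The sketch below uses hypothetical parameter values for $J=3$ and two covariates; none of the numbers come from a fitted model.

```r
# Hypothetical parameter values for J = 3 categories and two covariates.
beta0 <- c(0.5, -0.2)              # intercepts for categories 1 and 2
beta  <- rbind(c(1.0, -0.5),       # slopes for category 1
               c(0.3,  0.8))       # slopes for category 2
x     <- c(1, 2)                   # covariate vector x_it

# Linear predictors: log of Pr(Y = j | x) / Pr(Y = J | x) for j = 1, ..., J - 1.
eta <- beta0 + as.vector(beta %*% x)

# Invert the logits; the baseline category J has linear predictor 0.
probs <- exp(c(eta, 0)) / sum(exp(c(eta, 0)))
names(probs) <- paste0("category", 1:3)
probs
```

The probabilities sum to one, and the log ratio of each non-baseline probability to the baseline probability recovers the corresponding linear predictor.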

The LORterm argument must be an $L \times J^2$ matrix, where $L$ is the number of level pairs of repeated. These are ordered as $(1,2), (1,3), \ldots, (1,T), (2,3), \ldots, (T-1,T)$ and the rows of LORterm must preserve this order. Each row is assumed to contain the vectorized form of a probability table that satisfies the desired local odds ratios structure.
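A matrix with the required shape can be assembled in base R as sketched below. The values of `J` and `T_occ` are hypothetical, and the uniform probability table is chosen only because all of its local odds ratios equal 1 and because its vectorized form does not depend on whether the table is read row-wise or column-wise, a convention that this page does not specify.

```r
J <- 3       # number of response categories (hypothetical)
T_occ <- 4   # number of levels of repeated (hypothetical)

# Level pairs in the order (1,2), (1,3), ..., (1,T), (2,3), ..., (T-1,T).
level_pairs <- t(combn(T_occ, 2))
L <- nrow(level_pairs)   # T * (T - 1) / 2 pairs

# A J x J probability table; the uniform table has all local odds ratios equal to 1.
prob_table <- matrix(1 / J^2, nrow = J, ncol = J)

# One vectorized table per level pair, stacked row by row.
LORterm <- matrix(rep(as.vector(prob_table), times = L),
                  nrow = L, ncol = J^2, byrow = TRUE)
dim(LORterm)
```

For a non-trivial structure, each row would instead hold the vectorized table for the level pair in the matching row of `level_pairs`.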

References

Touloumis, A. (2011). GEE for multinomial responses. PhD dissertation, University of Florida.

Touloumis, A., Agresti, A. and Kateri, M. (2013). GEE for multinomial responses using a local odds ratios parameterization. Biometrics, 69, 633-640.

Touloumis, A. (2015). R Package multgee: A Generalized Estimating Equations Solver for Multinomial Responses. Journal of Statistical Software, 64, 1-14.

Examples

library(multgee)
## See the interpretation in Touloumis (2011).
data(housing)
fitmod <- nomLORgee(y ~ factor(time) * sec, data = housing, id = id, repeated = time)
summary(fitmod)
