glmtoolbox (version 0.1.12)

gnmgee: Fit Nonlinear Generalized Estimating Equations

Description

Fits a nonlinear Generalized Estimating Equation (GEE) model to the data and returns an object of class glmgee in which the main results of the fit are stored.

Usage

gnmgee(
  formula,
  family = gaussian(),
  offset = NULL,
  weights = NULL,
  id,
  waves,
  data,
  subset = NULL,
  corstr,
  corr,
  start = NULL,
  scale.fix = FALSE,
  scale.value = 1,
  toler = 1e-05,
  maxit = 50,
  trace = FALSE,
  ...
)

Value

an object of class glmgee in which the main results of the GEE model fitted to the data are stored, i.e., a list with components including

coefficients: a vector with the estimates of \(\beta_1,\ldots,\beta_p\),
fitted.values: a vector with the estimates of \(\mu_{ij}\) for \(i=1,\ldots,n\) and \(j=1,\ldots,n_i\),
start: a vector with the starting values used,
iter: a numeric constant with the number of iterations,
prior.weights: a vector with the values of \(\omega_{ij}\) for \(i=1,\ldots,n\) and \(j=1,\ldots,n_i\),
offset: a vector with the values of \(z_{ij}\) for \(i=1,\ldots,n\) and \(j=1,\ldots,n_i\),
terms: an object containing the terms objects,
loglik: the value of the quasi-log-likelihood function evaluated at the parameter estimates and the observed data,
estfun: a vector with the estimating equations evaluated at the parameter estimates and the observed data,
formula: the formula,
levels: the levels of the categorical regressors,
contrasts: an object containing the contrasts corresponding to levels,
converged: a logical indicating successful convergence,
model: the full model frame,
y: a vector with the values of \(y_{ij}\) for \(i=1,\ldots,n\) and \(j=1,\ldots,n_i\),
family: an object containing the family object used,
linear.predictors: a vector with the estimates of \(g(\mu_{ij})\) for \(i=1,\ldots,n\) and \(j=1,\ldots,n_i\),
R: a matrix with the (robust) estimate of the variance-covariance,
corr: a matrix with the estimate of the working-correlation,
corstr: a character string specifying the working-correlation structure,
id: a vector which identifies the subjects or clusters,
sizes: a vector with the values of \(n_i\) for \(i=1,\ldots,n\),
call: the original function call.

Arguments

formula

a nonlinear model formula including variables and parameters, which is a symbolic description of the nonlinear predictor of the model to be fitted to the data.

family

an (optional) family object, that is, a list of functions and expressions for defining link and variance functions. The families (and links) supported are the same as those supported by glm through its family argument, that is, gaussian, binomial, poisson, Gamma, inverse.gaussian, and quasi. The family negative.binomial in the package MASS is also available. As default, the argument family is set to gaussian(identity).

offset

an (optional) numeric vector of length equal to the number of cases, which can be used to specify an a priori known component to be included in the linear predictor during fitting.

weights

an (optional) vector of positive "prior weights" to be used in the fitting process. The length of weights should be the same as the total number of observations.

id

a vector which identifies the subjects or clusters. The length of id should be the same as the number of observations.

waves

an (optional) positive integer-valued variable that is used to identify the order and spacing of observations within clusters. This argument is crucial when there are missing values and gaps in the data. As default, waves is equal to the integers from 1 to the size of each cluster.

data

an (optional) data frame in which to look for variables involved in the formula expression, as well as for variables specified in the arguments id and weights. The data are assumed to be sorted by id and time.

subset

an (optional) vector specifying a subset of observations to be used in the fitting process.

corstr

an (optional) character string specifying the working-correlation structure. The available options are: "Independence", "Unstructured", "Stationary-M-dependent(m)", "Non-Stationary-M-dependent(m)", "AR-M-dependent(m)", "Exchangeable" and "User-defined", where m represents the lag of the dependence. As default, corstr is set to "Independence".

corr

an (optional) square matrix whose dimension equals the maximum cluster size, containing the user-specified correlation. This is only appropriate if corstr is set to "User-defined".

start

an (optional) vector of starting values for the parameters in the nonlinear predictor. When start is missing (and formula is not a self-starting model, see nls and selfStart), a very cheap guess for start is tried.

scale.fix

an (optional) logical variable. If TRUE, the scale parameter is fixed at the value of scale.value. As default, scale.fix is set to FALSE.

scale.value

an (optional) numeric value at which the scale parameter should be fixed. This is only appropriate if scale.fix=TRUE. As default, scale.value is set to 1.

toler

an (optional) positive value which represents the convergence tolerance. The convergence is reached when the maximum of the absolute relative differences between the values of the parameters in the nonlinear predictor in consecutive iterations of the fitting algorithm is lower than toler. As default, toler is set to 0.00001.

maxit

an (optional) integer value which represents the maximum number of iterations allowed for the fitting algorithm. As default, maxit is set to 50.

trace

an (optional) logical variable. If TRUE, output is produced for each iteration of the estimating algorithm.

...

further arguments passed to or from other methods.
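
The stopping rule described for toler can be sketched in base R. This is a minimal illustration rather than the package's internal code: the helper name has_converged is hypothetical, and dividing by the previous iterate is an assumption about how the relative difference is formed.

```r
# Hypothetical helper (not part of glmtoolbox): TRUE when the largest
# absolute relative change between consecutive iterates is below toler.
# Assumption: the change is taken relative to the previous iterate.
has_converged <- function(beta_old, beta_new, toler = 1e-05) {
  rel_diff <- abs((beta_new - beta_old) / beta_old)
  max(rel_diff) < toler
}

# Two made-up consecutive iterates for a three-parameter predictor.
beta_prev <- c(b1 = 200, b2 = 760, b3 = 375)
beta_next <- c(b1 = 200.001, b2 = 760.002, b3 = 375.0005)

loose  <- has_converged(beta_prev, beta_next)                 # default toler
strict <- has_converged(beta_prev, beta_next, toler = 1e-07)  # tighter toler
```

Here the largest relative change is about 5e-06, so the default tolerance is met while the tighter one is not. In an actual fit the iterations would also stop once maxit is reached.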

Details

The values of the multivariate response variable measured on \(n\) subjects or clusters, denoted by \(y_{i}=(y_{i1},\ldots,y_{in_i})^{\top}\) for \(i=1,\ldots,n\), are assumed to be realizations of independent random vectors denoted by \(Y_{i}=(Y_{i1},\ldots,Y_{in_i})^{\top}\) for \(i=1,\ldots,n\). The random variables associated with the \(i\)-th subject or cluster, \(Y_{ij}\) for \(j=1,\ldots,n_i\), are assumed to satisfy \(\mathrm{E}(Y_{ij})=\mu_{ij}\), \(\mathrm{Var}(Y_{ij})=\frac{\phi}{\omega_{ij}}\mathrm{V}(\mu_{ij})\) and \(\mathrm{Corr}(Y_{ij},Y_{ik})=r_{jk}(\rho)\), where \(\phi>0\) is the dispersion parameter, \(\mathrm{V}(\mu_{ij})\) is the variance function, \(\omega_{ij}>0\) is a known weight, and \(\rho=(\rho_1,\ldots,\rho_q)^{\top}\) is a parameter vector. In addition, \(\mu_{ij}\) is assumed to be dependent on the regressors vector \(x_{ij}\) by \(g(\mu_{ij})=z_{ij} + m(x_{ij},\beta)\), where \(g(\cdot)\) is the link function, \(z_{ij}\) is a known offset, \(\beta=(\beta_1,\ldots,\beta_p)^{\top}\) is a vector of regression parameters and \(m(x_{ij},\beta)\) is a known nonlinear function of \(\beta\). The parameter estimates are obtained by iteratively solving the estimating equations described by Liang and Zeger (1986).
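
The moment assumptions above imply a working covariance matrix for each cluster. The base R sketch below builds it for one cluster of size 3. All numeric values are made up for illustration, the exchangeable structure and the Gamma variance function are assumed choices, and the object names are hypothetical rather than anything gnmgee exposes.

```r
# Illustrative values for one cluster of size 3 (all numbers are made up).
mu    <- c(30, 58, 87)   # means mu_ij
omega <- c(1, 1, 2)      # known prior weights omega_ij
phi   <- 0.04            # dispersion parameter
rho   <- 0.6             # correlation parameter (exchangeable, assumed)

# Variance function V(mu) of the Gamma family.
variance_fun <- function(mu) mu^2

# Var(Y_ij) = phi / omega_ij * V(mu_ij)
var_y <- phi / omega * variance_fun(mu)

# Corr(Y_ij, Y_ik) = r_jk(rho): exchangeable puts rho off the diagonal.
R_work <- matrix(rho, nrow = 3, ncol = 3)
diag(R_work) <- 1

# Cov(Y_ij, Y_ik) = sqrt(Var(Y_ij) * Var(Y_ik)) * r_jk(rho)
cov_y <- sqrt(outer(var_y, var_y)) * R_work
```

The diagonal of cov_y reproduces var_y, and rescaling cov_y to a correlation matrix recovers R_work, which is what the variance and correlation assumptions jointly require.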

If the maximum cluster size is 6 and, for a cluster of size 4, the value of waves is set to 2, 4, 5, 6, then the data at times 1 and 3 are missing. gnmgee takes this into account when the structure of the correlation matrix is assumed to be "Unstructured", "Stationary-M-dependent", "Non-Stationary-M-dependent" or "AR-M-dependent". If waves is not specified in this scenario, then gnmgee assumes that the available data for this cluster were taken at times 1, 2, 3 and 4.
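
The effect of waves on the working correlation can be illustrated in base R. The sketch below is not gnmgee's internal code: it assumes, purely for illustration, a first-order autoregressive pattern with a made-up value of rho, and shows which entries of the 6 x 6 matrix apply to the cluster under the two readings of its observation times.

```r
rho      <- 0.5   # made-up correlation parameter
max_size <- 6     # maximum cluster size
times    <- seq_len(max_size)

# Full working-correlation matrix: entry (j, k) is rho^|j - k|.
R_full <- rho^abs(outer(times, times, "-"))

# Cluster of size 4 observed at times 2, 4, 5 and 6.
waves_given   <- c(2, 4, 5, 6)
# Without waves, the observations are treated as times 1, 2, 3 and 4.
waves_default <- seq_along(waves_given)

R_given   <- R_full[waves_given, waves_given]
R_default <- R_full[waves_default, waves_default]

# Correlation between the first two available observations.
lag_two <- R_given[1, 2]    # times 2 and 4 are two steps apart: rho^2
lag_one <- R_default[1, 2]  # times 1 and 2 are one step apart: rho
```

With waves supplied, the first two observations are correlated at lag 2 rather than lag 1, so omitting waves for a cluster with gaps would misstate the assumed dependence.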

A set of standard extractor functions for fitted model objects is available for objects of class glmgee, including methods to generic functions such as print, summary, model.matrix, estequa, coef, vcov, logLik, fitted, confint and predict. In addition, the model may be assessed using functions such as anova.glmgee, residuals.glmgee, dfbeta.glmgee, cooks.distance.glmgee, tidy.glmgee and glance.glmgee.

References

Liang K.Y., Zeger S.L. (1986) Longitudinal data analysis using generalized linear models. Biometrika 73:13-22.

Zeger S.L., Liang K.Y. (1986) Longitudinal data analysis for discrete and continuous outcomes. Biometrics 42:121-130.

Hardin J.W., Hilbe J.M. (2013) Generalized Estimating Equations. Chapman & Hall, London.

Vanegas L.H., Rondon L.M., Paula G.A. (2023) Generalized Estimating Equations using the new R package glmtoolbox. The R Journal 15:105-133.

See Also

glmgee, wglmgee

Examples

###### Example 1: Orange trees grown at Riverside, California
data(Oranges)
mod <- Trunk ~ b1/(1 + exp((b2-Days)/b3))
start <- c(b1=200,b2=760,b3=375)
fit1 <- gnmgee(mod, start=start, id=Tree, family=Gamma(identity), corstr="Exchangeable",
               data=Oranges)
summary(fit1, corr.digits=2)

mod <- Trunk ~ SSlogis(Days,b1,b2,b3)
fit2 <- gnmgee(mod, id=Tree, family=Gamma(identity), corstr="Exchangeable", data=Oranges)
summary(fit2, corr.digits=2)

###### Example 2: Growth of Paramecium aurelium
data(paramecium)
fit3 <- gnmgee(Number ~ exp(alpha - exp(beta - gamma*Days)), id=Colony, family=poisson(log),
               start=c(alpha=1.85,beta=0.7,gamma=0.35), corstr="AR-M-dependent(1)", data=paramecium)
summary(fit3, corr.digits=2)
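
The two formulas used in Example 1 describe the same logistic growth curve, with b1, b2 and b3 playing the roles of the Asym, xmid and scal arguments of SSlogis. The base R check below is a small sketch of that equivalence. The vector days holds illustrative values only and is not taken from the Oranges data.

```r
# Illustrative ages in days (not read from the Oranges data set).
days <- c(118, 484, 664, 1004, 1231, 1372, 1582)

# Parameter values matching the starting values used in Example 1.
b1 <- 200   # asymptote (Asym in SSlogis)
b2 <- 760   # inflection point (xmid in SSlogis)
b3 <- 375   # scale (scal in SSlogis)

# Explicit form of the predictor, as in the first formula.
explicit <- b1 / (1 + exp((b2 - days) / b3))

# Self-starting form, as in the second formula. as.numeric() drops the
# "gradient" attribute that SSlogis attaches when given named parameters.
selfstart <- as.numeric(SSlogis(days, b1, b2, b3))

same_curve <- isTRUE(all.equal(explicit, selfstart))
```

Because SSlogis is a self-starting model, the second fit in Example 1 can omit start, as described for the start argument.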