clmm2: Cumulative link mixed models

Description

Fits cumulative link mixed models, i.e. cumulative link models with random effects via the Laplace approximation or the standard and the adaptive Gauss-Hermite quadrature approximation. The functionality in clm2 is also implemented here. Currently only a single random term is allowed in the location-part of the model.

A new implementation is available in clmm that allows for more than one random effect.

Usage

clmm2(location, scale, nominal, random, data, weights, start, subset,
     na.action, contrasts, Hess = FALSE, model = TRUE, sdFixed,
     link = c("logistic", "probit", "cloglog", "loglog",
     "cauchit", "Aranda-Ordaz", "log-gamma"), lambda,
     doFit = TRUE, control, nAGQ = 1,
     threshold = c("flexible", "symmetric", "equidistant"), ...)

Arguments

location

as in clm2.

scale

as in clm2.

nominal

as in clm2.

random

a factor for the random effects in the location-part of the model.

data

as in clm2.

weights

as in clm2.

start

initial values for the parameters in the format c(alpha, beta, log(zeta), lambda, log(stDev)) where stDev is the standard deviation of the random effects.

subset

as in clm2.

na.action

as in clm2.

contrasts

as in clm2.

Hess

logical for whether the Hessian (the inverse of the observed information matrix) should be computed. Use Hess = TRUE if you intend to call summary or vcov on the fit and Hess = FALSE in all other instances to save computing time.

model

as in clm2.

sdFixed

If sdFixed is specified (a positive scalar), a model is fitted where the standard deviation for the random term is fixed at the value of sdFixed. If sdFixed is left unspecified, the standard deviation of the random term is estimated from data.

link

as in clm2.

lambda

as in clm2.

doFit

as in clm2 although it can also be one of c("no", "R" "C"), where "R" use the R-implementation for fitting, "C" (default) use C-implementation for fitting and "no" behaves as FALSE and returns the environment.

control

a call to clmm2.control.

threshold

as in clm2.

nAGQ

the number of quadrature points to be used in the adaptive Gauss-Hermite quadrature approximation to the marginal likelihood. Defaults to 1 which leads to the Laplace approximation. An odd number of quadrature points is encouraged and 3, 5 or 7 are usually enough to achive high precision. Negative values give the standard, i.e. non-adaptive Gauss-Hermite quadrature.

…

additional arguments are passed on to clm2.control and possibly further on to the optimizer, which can lead to surprising error or warning messages when mistyping arguments etc.

Value

If doFit = FALSE the result is an environment representing the model ready to be optimized. If doFit = TRUE the result is an object of class "clmm2" with the following components:

stDev

the standard deviation of the random effects.

Niter

the total number of iterations in the Newton updates of the conditional modes of the random effects.

grFac

the grouping factor defining the random effects.

nAGQ

the number of quadrature points used in the adaptive Gauss-Hermite Quadrature approximation to the marginal likelihood.

ranef

the conditional modes of the random effects, sometimes referred to as "random effect estimates".

condVar

the conditional variances of the random effects at their conditional modes.

beta

the parameter estimates of the location part.

zeta

the parameter estimates of the scale part on the log scale; the scale parameter estimates on the original scale are given by exp(zeta).

Alpha

vector or matrix of the threshold parameters.

Theta

vector or matrix of the thresholds.

vector of threshold parameters, which, given a threshold function (e.g. "equidistant"), and possible nominal effects define the class boundaries, Theta.

lambda

the value of lambda if lambda is supplied or estimated, otherwise missing.

coefficients

the coefficients of the intercepts (theta), the location (beta), the scale (zeta), and the link function parameter (lambda).

df.residual

the number of residual degrees of freedoms, calculated using the weights.

fitted.values

vector of fitted values conditional on the values of the random effects. Use predict to get the fitted values for a random effect of zero. An observation here is taken to be each of the scalar elements of the multinomial table and not a multinomial vector.

convergence

TRUE if the optimizer terminates wihtout error and FALSE otherwise.

gradient

vector of gradients for the unit-variance random effects at their conditional modes.

optRes

list with results from the optimizer. The contents of the list depends on the choice of optimizer.

logLik

the log likelihood of the model at optimizer termination.

Hessian

if the model was fitted with Hess = TRUE, this is the Hessian matrix of the parameters at the optimum.

scale

model.frame for the scale model.

location

model.frame for the location model.

nominal

model.frame for the nominal model.

edf

the (effective) number of degrees of freedom used by the model.

start

the starting values.

method

character, the optimizer.

the response variable.

lev

the names of the levels of the response variable.

nobs

the (effective) number of observations, calculated as the sum of the weights.

threshold

character, the threshold function used in the model.

estimLambda

1 if lambda is estimated in one of the flexible link functions and 0 otherwise.

link

character, the link function used in the model.

call

the matched call.

contrasts

contrasts applied to terms in location and scale models.

na.action

the function used to filter missing data.

Details

There are methods for the standard model-fitting functions, including summary, vcov, profile, plot.profile, confint, anova, logLik, predict and an extractAIC method.

A Newton scheme is used to obtain the conditional modes of the random effects for Laplace and AGQ approximations, and a non-linear optimization is performed over the fixed parameter set to get the maximum likelihood estimates. The Newton scheme uses the observed Hessian rather than the expected as is done in e.g. glmer, so results from the Laplace approximation for binomial fits should in general be more precise - particularly for other links than the "logistic".

Core parts of the function are implemented in C-code for speed.

The function calls clm2 to up an environment and to get starting values.

References

Agresti, A. (2002) Categorical Data Analysis. Second edition. Wiley.

Examples

Run this code

# NOT RUN {
options(contrasts = c("contr.treatment", "contr.poly"))

## More manageable data set:
dat <- subset(soup, as.numeric(as.character(RESP)) <=  24)
dat$RESP <- dat$RESP[drop=TRUE]

m1 <- clmm2(SURENESS ~ PROD, random = RESP, data = dat, link="probit",
           Hess = TRUE, method="ucminf", threshold = "symmetric")

m1
summary(m1)
logLik(m1)
vcov(m1)
extractAIC(m1)
anova(m1, update(m1, location = SURENESS ~ 1, Hess = FALSE))
anova(m1, update(m1, random = NULL))

## Use adaptive Gauss-Hermite quadrature rather than the Laplace
## approximation:
update(m1, Hess = FALSE, nAGQ = 3)

## Use standard Gauss-Hermite quadrature:
update(m1, Hess = FALSE, nAGQ = -7)

##################################################################
## Binomial example with the cbpp data from the lme4-package:
if(require(lme4)) {
    cbpp2 <- rbind(cbpp[,-(2:3)], cbpp[,-(2:3)])
    cbpp2 <- within(cbpp2, {
        incidence <- as.factor(rep(0:1, each=nrow(cbpp)))
        freq <- with(cbpp, c(incidence, size - incidence))
    })

    ## Fit with Laplace approximation:
    fm1 <- clmm2(incidence ~ period, random = herd, weights = freq,
                 data = cbpp2, Hess = 1)
    summary(fm1)

    ## Fit with the adaptive Gauss-Hermite quadrature approximation:
    fm2 <- clmm2(incidence ~ period, random = herd, weights = freq,
                 data = cbpp2, Hess = 1, nAGQ = 7)
    summary(fm2)
}

# }

Run the code above in your browser using DataLab