Learn R Programming

VGAM (version 1.1-14)

cumulative: Ordinal Regression with Cumulative Probabilities

Description

Fits a cumulative link regression model to a (preferably ordered) factor response.

Usage

cumulative(link = "logitlink", parallel = FALSE,
    reverse = FALSE, multiple.responses = FALSE,
    ynames = FALSE, Thresh = NULL, Trev = reverse,
    Tref = if (Trev) "M" else 1, Intercept = NULL,
    whitespace = FALSE)

Arguments

Value

An object of class "vglmff"

(see vglmff-class). The object is used by modelling functions such as vglm, and vgam.

Details

In this help file the response \(Y\) is assumed to be a factor with ordered values \(1,2,\dots,J+1\). Hence \(M\) is the number of linear/additive predictors \(\eta_j\); for cumulative() one has \(M=J\).

This VGAM family function fits the class of cumulative link models to (hopefully) an ordinal response. By default, the non-parallel cumulative logit model is fitted, i.e., $$\eta_j = logit(P[Y \leq j])$$ where \(j=1,2,\dots,M\) and the \(\eta_j\) are not constrained to be parallel. This is also known as the non-proportional odds model. If the logit link is replaced by a complementary log-log link (clogloglink) then this is known as the proportional-hazards model.

In almost all the literature, the constraint matrices associated with this family of models are known. For example, setting parallel = TRUE will make all constraint matrices (except for the intercept) equal to a vector of \(M\) 1's. If the constraint matrices are equal, unknown and to be estimated, then this can be achieved by fitting the model as a reduced-rank vector generalized linear model (RR-VGLM; see rrvglm). Currently, reduced-rank vector generalized additive models (RR-VGAMs) have not been implemented here.

References

Agresti, A. (2013). Categorical Data Analysis, 3rd ed. Hoboken, NJ, USA: Wiley.

Agresti, A. (2010). Analysis of Ordinal Categorical Data, 2nd ed. Hoboken, NJ, USA: Wiley.

McCullagh, P. and Nelder, J. A. (1989). Generalized Linear Models, 2nd ed. London: Chapman & Hall.

Pujol-Rigol, S., Fernandez, D. and Casals, M. (2025). A systematic review and comparative study of R packages for ordinal response regression models. WIREs Computational Statistics, 17, e70025.

Tutz, G. (2012). Regression for Categorical Data, Cambridge: Cambridge University Press.

Tutz, G. and Berger, M. (2022). Sparser ordinal regression models based on parametric and additive location-shift approaches. International Statistical Review, 90, 306--327. tools:::Rd_expr_doi("10.1111/insr.12484").

Yee, T. W. (2010). The VGAM package for categorical data analysis. Journal of Statistical Software, 32, 1--34. tools:::Rd_expr_doi("10.18637/jss.v032.i10").

Yee, T. W. and Wild, C. J. (1996). Vector generalized additive models. Journal of the Royal Statistical Society, Series B, Methodological, 58, 481--493.

See Also

propodds, constraints, CM.ones, CM.equid, R2latvar, ordsup, prplot, margeff, acat, cratio, sratio, multinomial, CommonVGAMffArguments, pneumo, budworm, Links, hdeff.vglm, logitlink, probitlink, clogloglink, cauchitlink, logistic1.

Examples

Run this code
# Proportional odds model (p.179) of McCullagh and Nelder (1989)
pneumo <- transform(pneumo, let = log(exposure.time))
(fit <- vglm(cbind(normal, mild, severe) ~ let,
             cumulative(parallel = TRUE, reverse = TRUE), pneumo))
depvar(fit)  # Sample proportions (good technique)
fit@y        # Sample proportions (bad technique)
weights(fit, type = "prior")  # Number of observations
coef(fit, matrix = TRUE)
constraints(fit)  # Constraint matrices
apply(fitted(fit), 1, which.max)  # Classification
apply(predict(fit, newdata = pneumo, type = "response"),
      1, which.max)  # Classification
R2latvar(fit)

# Check that the model is linear in let ----------------------
fit2 <- vgam(cbind(normal, mild, severe) ~ s(let, df = 2),
             cumulative(reverse = TRUE), data = pneumo)
if (FALSE) {
 plot(fit2, se = TRUE, overlay = TRUE, lcol = 1:2, scol = 1:2) }

# Check the proportional odds assumption with a LRT -----
(fit3 <- vglm(cbind(normal, mild, severe) ~ let,
              cumulative(parallel = FALSE, reverse = TRUE), pneumo))
pchisq(2 * (logLik(fit3) - logLik(fit)), df =
       length(coef(fit3)) - length(coef(fit)), lower.tail = FALSE)
lrtest(fit3, fit)  # More elegant

# A factor() version of fit ----------------------------
# This is in long format (cf. wide format above)
Nobs <- round(depvar(fit) * c(weights(fit, type = "prior")))
sumNobs <- colSums(Nobs)  # apply(Nobs, 2, sum)

pneumo.long <-
  data.frame(symptoms = ordered(rep(rep(colnames(Nobs), nrow(Nobs)),
                                        times = c(t(Nobs))),
                                levels = colnames(Nobs)),
             let = rep(rep(with(pneumo, let), each = ncol(Nobs)),
                       times = c(t(Nobs))))
with(pneumo.long, table(let, symptoms))  # Should be same as pneumo


(fit.long1 <- vglm(symptoms ~ let, data = pneumo.long, trace = TRUE,
                   cumulative(parallel = TRUE, reverse = TRUE)))
coef(fit.long1, matrix = TRUE)  # cf. coef(fit, matrix = TRUE)
# Could try using mustart if fit.long1 failed to converge.
mymustart <- matrix(sumNobs / sum(sumNobs),
                    nrow(pneumo.long), ncol(Nobs), byrow = TRUE)
fit.long2 <- vglm(symptoms ~ let, mustart = mymustart,
                  cumulative(parallel = TRUE, reverse = TRUE),
                  data = pneumo.long, trace = TRUE)
coef(fit.long2, matrix = TRUE)  # cf. coef(fit, matrix = TRUE)

Run the code above in your browser using DataLab