Fits a cumulative link regression model to a (preferably ordered) factor response.
cumulative(link = "logit", parallel = FALSE, reverse = FALSE,
multiple.responses = FALSE, whitespace = FALSE)
A logical or formula specifying which terms have
equal/unequal coefficients.
See below for more information about the parallelism assumption.
The default results in what some people call the
generalized ordered logit model to be fitted.
If parallel = TRUE
then it does not apply to the intercept.
Logical.
By default, the cumulative probabilities used are
\(P(Y\leq 1)\), \(P(Y\leq 2)\),
…, \(P(Y\leq J)\).
If reverse
is TRUE
then
\(P(Y\geq 2)\), \(P(Y\geq 3)\), …,
\(P(Y\geq J+1)\) are used.
This should be set to TRUE
for link=
golf
,
polf
,
nbolf
.
For these links the cutpoints must be an increasing sequence;
if reverse = FALSE
for then the cutpoints must be an
decreasing sequence.
Logical.
Multiple responses? If TRUE
then the input should be
a matrix with values \(1,2,\dots,L\), where \(L=J+1\) is the
number of levels.
Each column of the matrix is a response, i.e., multiple responses.
A suitable matrix can be obtained from Cut
.
See CommonVGAMffArguments
for information.
An object of class "vglmff"
(see vglmff-class
).
The object is used by modelling functions such as vglm
,
and vgam
.
No check is made to verify that the response is ordinal if the
response is a matrix;
see ordered
.
In this help file the response \(Y\) is assumed to be a factor
with ordered values \(1,2,\dots,J+1\).
Hence \(M\) is the number of linear/additive predictors
\(\eta_j\);
for cumulative()
one has \(M=J\).
This VGAM family function fits the class of
cumulative link models to (hopefully) an ordinal response.
By default, the non-parallel cumulative logit model is fitted, i.e.,
$$\eta_j = logit(P[Y \leq j])$$
where \(j=1,2,\dots,M\) and
the \(\eta_j\) are not constrained to be parallel.
This is also known as the non-proportional odds model.
If the logit link is replaced by a complementary log-log link
(cloglog
) then
this is known as the proportional-hazards model.
In almost all the literature, the constraint matrices associated
with this family of models are known. For example, setting
parallel = TRUE
will make all constraint matrices
(except for the intercept) equal to a vector of \(M\) 1's.
If the constraint matrices are equal, unknown and to be estimated, then
this can be achieved by fitting the model as a
reduced-rank vector generalized
linear model (RR-VGLM; see rrvglm
).
Currently, reduced-rank vector generalized additive models
(RR-VGAMs) have not been implemented here.
Agresti, A. (2013) Categorical Data Analysis, 3rd ed. Hoboken, NJ, USA: Wiley.
Agresti, A. (2010) Analysis of Ordinal Categorical Data, 2nd ed. Hoboken, NJ, USA: Wiley.
Dobson, A. J. and Barnett, A. (2008) An Introduction to Generalized Linear Models, 3rd ed. Boca Raton: Chapman & Hall/CRC Press.
McCullagh, P. and Nelder, J. A. (1989) Generalized Linear Models, 2nd ed. London: Chapman & Hall.
Simonoff, J. S. (2003) Analyzing Categorical Data, New York: Springer-Verlag.
Yee, T. W. (2010) The VGAM package for categorical data analysis. Journal of Statistical Software, 32, 1--34. http://www.jstatsoft.org/v32/i10/.
Yee, T. W. and Wild, C. J. (1996) Vector generalized additive models. Journal of the Royal Statistical Society, Series B, Methodological, 58, 481--493.
propodds
,
prplot
,
margeff
,
acat
,
cratio
,
sratio
,
multinomial
,
pneumo
,
Links
,
hdeff.vglm
,
logit
,
probit
,
cloglog
,
cauchit
,
golf
,
polf
,
nbolf
,
logistic1
.
# NOT RUN { # Fit the proportional odds model, p.179, in McCullagh and Nelder (1989) pneumo <- transform(pneumo, let = log(exposure.time)) (fit <- vglm(cbind(normal, mild, severe) ~ let, cumulative(parallel = TRUE, reverse = TRUE), data = pneumo)) depvar(fit) # Sample proportions (good technique) fit@y # Sample proportions (bad technique) weights(fit, type = "prior") # Number of observations coef(fit, matrix = TRUE) constraints(fit) # Constraint matrices apply(fitted(fit), 1, which.max) # Classification apply(predict(fit, newdata = pneumo, type = "response"), 1, which.max) # Classification # Check that the model is linear in let ---------------------- fit2 <- vgam(cbind(normal, mild, severe) ~ s(let, df = 2), cumulative(reverse = TRUE), data = pneumo) # } # NOT RUN { plot(fit2, se = TRUE, overlay = TRUE, lcol = 1:2, scol = 1:2) # } # NOT RUN { # Check the proportional odds assumption with a LRT ---------- (fit3 <- vglm(cbind(normal, mild, severe) ~ let, cumulative(parallel = FALSE, reverse = TRUE), data = pneumo)) pchisq(2 * (logLik(fit3) - logLik(fit)), df = length(coef(fit3)) - length(coef(fit)), lower.tail = FALSE) lrtest(fit3, fit) # More elegant # A factor() version of fit ---------------------------------- # This is in long format (cf. wide format above) Nobs <- round(depvar(fit) * c(weights(fit, type = "prior"))) sumNobs <- colSums(Nobs) # apply(Nobs, 2, sum) pneumo.long <- data.frame(symptoms = ordered(rep(rep(colnames(Nobs), nrow(Nobs)), times = c(t(Nobs))), levels = colnames(Nobs)), let = rep(rep(with(pneumo, let), each = ncol(Nobs)), times = c(t(Nobs)))) with(pneumo.long, table(let, symptoms)) # Should be same as pneumo (fit.long1 <- vglm(symptoms ~ let, data = pneumo.long, trace = TRUE, cumulative(parallel = TRUE, reverse = TRUE))) coef(fit.long1, matrix = TRUE) # Should be as coef(fit, matrix = TRUE) # Could try using mustart if fit.long1 failed to converge. mymustart <- matrix(sumNobs / sum(sumNobs), nrow(pneumo.long), ncol(Nobs), byrow = TRUE) fit.long2 <- vglm(symptoms ~ let, mustart = mymustart, cumulative(parallel = TRUE, reverse = TRUE), data = pneumo.long, trace = TRUE) coef(fit.long2, matrix = TRUE) # Should be as coef(fit, matrix = TRUE) # }
Run the code above in your browser using DataCamp Workspace