multinomial(zero = NULL, parallel = FALSE, nointercept = NULL,
            refLevel = "last", whitespace = FALSE)
See CommonVGAMffArguments for more details.

An object of class "vglmff" (see vglmff-class). The object is used by
modelling functions such as vglm, rrvglm and vgam.
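As a rough sketch of how these arguments are typically supplied inside a
vglm() call (the data frame mydf, the factor yfactor and the covariate x2
below are invented for illustration; zero and refLevel behave as described
in CommonVGAMffArguments):

library(VGAM)
set.seed(123)
mydf <- data.frame(yfactor = gl(3, 20, labels = c("A", "B", "C")),
                   x2 = runif(60))
# Use the first level, not the default last level, as the baseline
fit.ref1 <- vglm(yfactor ~ x2, multinomial(refLevel = 1), data = mydf)
# zero = 1 asks for the first linear predictor to be intercept-only
fit.zero <- vglm(yfactor ~ x2, multinomial(zero = 1), data = mydf)
coef(fit.ref1, matrix = TRUE)
coef(fit.zero, matrix = TRUE)  # The x2 entry for eta_1 is constrained to 0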
See CommonVGAMffArguments for more warnings.
The default model can be written eta_j = log(P[Y = j] / P[Y = M+1]),
j = 1, ..., M, where eta_j is the jth linear predictor. By default the
last level of the factor (equivalently, the last column of the response
matrix) is taken as the reference or baseline level; this can be changed
with the refLevel argument.
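A minimal numerical check of this parameterisation (a sketch only, using
the iris data of Example 1 below): the fitted linear predictors should
equal the log-odds of each level against the reference (last) level.

library(VGAM)
fit0 <- vglm(Species ~ Sepal.Length, multinomial, data = iris)
eta  <- predict(fit0)   # The M = 2 linear predictors
prob <- fitted(fit0)    # The M + 1 = 3 fitted probabilities
# eta_j = log(P[Y = j] / P[Y = M+1]), with the last level as the baseline
max(abs(eta - log(prob[, 1:2] / prob[, 3])))  # Should be essentially 0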
In almost all the literature, the constraint matrices associated with this
family of models are known. For example, setting parallel = TRUE will make
all constraint matrices (except for the intercept) equal to a vector of
M 1's. If the constraint matrices are unknown and to be estimated, then
this can be achieved by fitting the model as a reduced-rank vector
generalized linear model (RR-VGLM; see rrvglm). In particular, a
multinomial logit model with unknown constraint matrices is known as a
stereotype model (Anderson, 1984), and can be fitted with rrvglm.
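The effect of parallel = TRUE on the constraint matrices can be inspected
directly with constraints(); the following sketch (again using the iris
data) is only illustrative.

library(VGAM)
fit.nonpar <- vglm(Species ~ Sepal.Length, multinomial, data = iris)
constraints(fit.nonpar)$Sepal.Length   # Identity matrix, diag(M) with M = 2
fit.par <- vglm(Species ~ Sepal.Length, multinomial(parallel = TRUE),
                data = iris)
constraints(fit.par)$Sepal.Length      # A single column of M 1's
constraints(fit.par)$"(Intercept)"     # The intercept is not constrained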
Yee, T. W. and Hastie, T. J. (2003) Reduced-rank vector generalized linear models. Statistical Modelling, 3, 15--41.
McCullagh, P. and Nelder, J. A. (1989) Generalized Linear Models, 2nd ed. London: Chapman & Hall.
Agresti, A. (2002) Categorical Data Analysis, 2nd ed. New York, USA: Wiley.
Hastie, T. J., Tibshirani, R. J. and Friedman, J. H. (2009) The Elements of Statistical Learning: Data Mining, Inference and Prediction, 2nd ed. New York, USA: Springer-Verlag.
Simonoff, J. S. (2003) Analyzing Categorical Data. New York, USA: Springer-Verlag.
Anderson, J. A. (1984) Regression and ordered categorical variables. Journal of the Royal Statistical Society, Series B, Methodological, 46, 1--30.
Tutz, G. (2012) Regression for Categorical Data. Cambridge: Cambridge University Press.
margeff, cumulative, acat, cratio, sratio, dirichlet, dirmultinomial,
rrvglm, fill1, Multinomial, multilogit, iris.
The author's homepage has further documentation about
categorical data analysis using VGAM.
# Example 1: fit a multinomial logit model to Edgar Anderson's iris data
data(iris)
fit <- vglm(Species ~ ., multinomial, iris)
coef(fit, matrix = TRUE)
# Example 2a: a simple example
ycounts <- t(rmultinom(10, size = 20, prob = c(0.1, 0.2, 0.8))) # Counts
fit <- vglm(ycounts ~ 1, multinomial)
head(fitted(fit)) # Proportions
fit@prior.weights # NOT recommended for extraction of prior weights
weights(fit, type = "prior", matrix = FALSE) # The better method
depvar(fit) # Sample proportions; same as fit@y
constraints(fit) # Constraint matrices
# Example 2b: Different reference level used as the baseline
fit2 <- vglm(ycounts ~ 1, multinomial(refLevel = 2))
coef(fit2, matrix = TRUE)
coef(fit , matrix = TRUE) # Easy to reconcile this output with fit2
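# Reconciling the two parameterisations: fit models log(p1/p3) and log(p2/p3),
# while fit2 models log-ratios against level 2; these are linear combinations
# of one another, e.g. log(p1/p2) = log(p1/p3) - log(p2/p3).
coef(fit)[1] - coef(fit)[2]  # Should appear among the intercepts of fit2
-coef(fit)[2]                # Ditto; this is log(p3/p2)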
# Example 3: The response is a factor.
nn <- 10
dframe3 <- data.frame(yfactor = gl(3, nn, labels = c("Control", "Trt1", "Trt2")),
                      x2 = runif(3 * nn))
myrefLevel <- with(dframe3, yfactor[12])
fit3a <- vglm(yfactor ~ x2, multinomial(refLevel = myrefLevel), dframe3)
fit3b <- vglm(yfactor ~ x2, multinomial(refLevel = 2), dframe3)
coef(fit3a, matrix = TRUE)  # "Trt1" is the reference level
coef(fit3b, matrix = TRUE)  # "Trt1" is the reference level
margeff(fit3b)  # Marginal effects of x2 on the fitted probabilities
# Example 4: Fit a rank-1 stereotype model
fit4 <- rrvglm(Country ~ Width + Height + HP, multinomial, data = car.all)
coef(fit4) # Contains the C matrix
constraints(fit4)$HP # The A matrix
coef(fit4, matrix = TRUE) # The B matrix
Coef(fit4)@C # The C matrix
concoef(fit4) # Better to get the C matrix this way
Coef(fit4)@A # The A matrix
svd(coef(fit4, matrix = TRUE)[-1, ])$d # This has rank 1; = C %*% t(A)
# Classification (but watch out for NAs in some of the variables):
apply(fitted(fit4), 1, which.max) # Classification
colnames(fitted(fit4))[apply(fitted(fit4), 1, which.max)] # Classification
apply(predict(fit4, car.all, type = "response"), 1, which.max) # Ditto
# Example 5: The use of the xij argument (aka conditional logit model)
set.seed(111)
nn <- 100 # Number of people who travel to work
M <- 3 # There are M+1 modes of transport to go to work
ycounts <- matrix(0, nn, M+1)
ycounts[cbind(1:nn, sample(x = M+1, size = nn, replace = TRUE))] <- 1
dimnames(ycounts) <- list(NULL, c("bus", "train", "car", "walk"))
gotowork <- data.frame(cost.bus   = runif(nn), time.bus   = runif(nn),
                       cost.train = runif(nn), time.train = runif(nn),
                       cost.car   = runif(nn), time.car   = runif(nn),
                       cost.walk  = runif(nn), time.walk  = runif(nn))
gotowork <- round(gotowork, digits = 2) # For convenience
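# The alternative-specific covariates enter as differences from the 'walk'
# baseline, since each linear predictor is a log-odds against the last
# (reference) alternative; the generic Cost and Time columns below merely
# supply names for the xij terms used in the model formula.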
gotowork <- transform(gotowork,
                      Cost.bus   = cost.bus   - cost.walk,
                      Cost.car   = cost.car   - cost.walk,
                      Cost.train = cost.train - cost.walk,
                      Cost       = cost.train - cost.walk,  # for labelling
                      Time.bus   = time.bus   - time.walk,
                      Time.car   = time.car   - time.walk,
                      Time.train = time.train - time.walk,
                      Time       = time.train - time.walk)  # for labelling
fit <- vglm(ycounts ~ Cost + Time,
            multinomial(parallel = TRUE ~ Cost + Time - 1),
            xij = list(Cost ~ Cost.bus + Cost.train + Cost.car,
                       Time ~ Time.bus + Time.train + Time.car),
            form2 = ~ Cost + Cost.bus + Cost.train + Cost.car +
                      Time + Time.bus + Time.train + Time.car,
            data = gotowork, trace = TRUE)
head(model.matrix(fit, type = "lm")) # LM model matrix
head(model.matrix(fit, type = "vlm")) # Big VLM model matrix
coef(fit)
coef(fit, matrix = TRUE)
constraints(fit)
summary(fit)
max(abs(predict(fit) - predict(fit, new = gotowork))) # Should be 0