mnlm: Estimation for high-dimensional Multinomial Logistic Regression

Description

Maximum a posteriori (MAP) estimation of multinomial logistic regression models.

Usage

mnlm(counts, covars, normalize=FALSE, lambda=NULL, start=NULL,
                  tol=0.1, tmax=1000, delta=1, dmin=0, bins=0, verb=TRUE)

Arguments

counts

A matrix of multinomial response counts in ncol(counts) categories for nrow(counts) individuals/observations. This can be a standard matrix, but for most text-analysis applications should be a simple

covars

A matrix of ncol(covars) covariate values for each of the nrow(counts) observations. This does not include the intercept, which is ALWAYS added in the design matrix.

normalize

Whether or not to normalize the covariate matrix to have mean zero and variance one.

lambda

Either a single fixed value, or a vector of length 2 giving the gamma hyperprior shape and rate parameters (e.g., c(s=2, r=2)). Here, lambda (>0) is a joint scale parameter for the Laplace prior on each non-intercept regressi

start

An initial guess for the full ncol(counts) by ncol(covars)+1 matrix of regression coefficients. Under the default start=NULL, the intercept is a logit transform of mean phrase frequencies and coef

tol

Optimization convergence tolerance for the improvement on the un-normalized negative log posterior over a single full parameter sweep.

tmax

The maximum number of optimization iterations.

delta

An initial step size for the least upper bound approximation to parameter information; implies a starting trust region of 2*delta.

dmin

Minimum trust region delta.

bins

For faster inference on large data sets (or just to collapse observations across levels for factor covariates), you can specify the number of bins for step-function approximations to the columns of covars. Counts a

verb

Level of print-statement output. TRUE prints some initial info and updates every iteration.
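
As a rough illustration of what normalize=TRUE describes (covariates rescaled to mean zero and variance one, with the original means and standard deviations kept), the sketch below uses base R's scale(). It is not taken from the mnlm source: the toy covars matrix and the names covar_mean, covar_sd and restored are made up for the example, and it assumes the sample standard deviation (n-1 denominator) that scale() uses.

```r
# Toy covariate matrix (hypothetical data, for illustration only)
covars <- cbind(x1 = c(1, 2, 3, 4), x2 = c(10, 30, 20, 40))

# Center each column to mean zero and scale it to unit variance
scaled <- scale(covars)

# scale() records the original column means and standard deviations
covar_mean <- attr(scaled, "scaled:center")
covar_sd   <- attr(scaled, "scaled:scale")

# Undo the transformation to recover the original covariates
restored <- sweep(sweep(scaled, 2, covar_sd, "*"), 2, covar_mean, "+")
```

Keeping the means and standard deviations, as the covarMean and covarSD entries of the returned object do, is what makes it possible to map results back to the original covariate scale.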

Value

An mnlm object: a list with the following entries.

intercept

The intercept estimates for each phrase ($\alpha$).

loadings

The loading estimates for each phrase ($\phi$).

counts

simple_triplet_matrix form of the counts input matrix.

X

The design matrix used for analysis; includes an added null column and may have merged observations from counts.

covars

The input covariates, possibly normalized.

V

The covariate matrix used for analysis; possibly normalized or binned, and including the intercept.

covarMean

If normalize=TRUE, the original covariate means. Otherwise empty.

covarSD

If normalize=TRUE, the original covariate standard deviations. Otherwise empty.

maplam

An indicator for whether the regularization penalty was estimated.

lampar

Parameters (init, shape, rate) for the regularization penalty.

lambda

The path of lambda estimates.

delta

The trust region deltas upon convergence.

L

The unnormalized negative log posterior at each iteration.

niter

The number of iterations.

tol, tmax

Convergence parameters, unchanged from input.

start

The initial coefficient estimates.

Details

Finds the posterior mode for multinomial logistic regression parameters using cyclic coordinate descent. The function is designed for inverse regression analysis of sentiment in text, where the number of response categories is very large, but it should be useful for any large-scale multinomial logistic regression. Regression coefficients and the Laplace regularization penalty can be estimated jointly. Regression coefficients are identified by augmenting each response vector with a null count of 0.01 and assuming zero coefficients for this null category. Full details are available in Taddy (2011).
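
For orientation, the model implied by this description can be written out as below. This is a sketch that agrees with the example code on this page, not a quotation from Taddy (2011). The symbols $x_{ij}$ (count for observation $i$ in category $j$), $v_i$ (covariate vector), $p$ (number of categories), $\eta_{ij}$ and $q_{ij}$ are notation assumed here; $\alpha$ and $\phi$ are the intercept and loadings. The null category has zero coefficients, so it contributes the 1 in each denominator:

$$
\eta_{ij} = \alpha_j + v_i^\top \phi_j, \qquad
q_{ij} = \frac{\exp(\eta_{ij})}{1 + \sum_{l=1}^{p} \exp(\eta_{il})}, \qquad
q_{i0} = \frac{1}{1 + \sum_{l=1}^{p} \exp(\eta_{il})}
$$

Assuming a fixed lambda and a Laplace penalty of the form $\lambda |\phi_{jk}|$ on each non-intercept coefficient, the unnormalized negative log posterior would then be, up to constants and any prior on the intercepts (which this page does not specify):

$$
L(\alpha, \phi) = -\sum_{i} \Big[ \sum_{j=1}^{p} x_{ij}\,\eta_{ij}
- \Big( 0.01 + \sum_{j=1}^{p} x_{ij} \Big) \log\Big( 1 + \sum_{l=1}^{p} \exp(\eta_{il}) \Big) \Big]
+ \lambda \sum_{j=1}^{p} \sum_{k} |\phi_{jk}|
$$

The 0.01 term comes from the null count added to each response vector.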

References

Taddy (2011), Inverse Regression for Analysis of Sentiment in Text. http://arxiv.org/abs/1012.2098

Examples

## See congress109 and we8there for real data examples

## Binomial simulation; re-run to see sampling variability
n <- 20
size <- 10
v <- rnorm(n)
p <- 1/(1+exp(-(1 + v*2)))  # inverse logit of 1 + 2*v
y <- rbinom(n, size=size, prob=p)
counts <- cbind(size-y, y)

## fit the logistic model
fit <- mnlm(counts, v)

## extract fitted probabilities
eta <- fit$intercept + fit$loadings%*%t(v)
q0 <- 1/(1+colSums(exp(eta))) # null category
phat <- t(exp(eta))*( q0/(1-q0) )
plot(p, phat[,2], pch=21, bg=rainbow(n), 
	xlab="true", ylab="fitted", main="binomial probability")
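
The probability extraction step above can also be checked without fitting anything. The sketch below wraps the same arithmetic in a helper and applies it to hand-picked coefficients. The function fitted_probs and the values of alpha, phi and v_new are hypothetical, chosen for illustration; the helper assumes the intercept is a vector with one entry per category and the loadings are a categories-by-covariates matrix, as the example's use of fit$intercept and fit$loadings suggests.

```r
# Hypothetical helper (not part of textir): convert intercepts and loadings
# into fitted category probabilities, renormalized to drop the null category.
fitted_probs <- function(intercept, loadings, covars) {
  covars   <- as.matrix(covars)               # n observations x K covariates
  loadings <- as.matrix(loadings)             # p categories x K covariates
  eta <- intercept + loadings %*% t(covars)   # p x n linear predictors
  q0  <- 1 / (1 + colSums(exp(eta)))          # null-category probability per observation
  q   <- t(exp(eta)) * q0                     # n x p probabilities for the real categories
  q / (1 - q0)                                # rescale so each row sums to one
}

# Two categories with made-up coefficients; category 2 has log-odds 1 + 2*v
alpha <- c(0, 1)
phi   <- matrix(c(0, 2), ncol = 1)
v_new <- c(-1, 0, 1)
phat_check <- fitted_probs(alpha, phi, v_new)
```

With these coefficients the second column of phat_check equals plogis(1 + 2*v_new), the same inverse logit used to simulate p in the example, so a well-fitted model should put the plotted points near the diagonal.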