mnlm: Estimation for high-dimensional Multinomial Logistic Regression

Description

MAP estimation of multinomial logistic regression models.

Usage

mnlm(counts, covars, normalize=FALSE, penalty=c(1,0.2), start=NULL, 
                  tol=0.1, tmax=1000, delta=1, dmin=0, bins=0, verb=FALSE)

Arguments

counts

A matrix of multinomial response counts in ncol(counts) categories for nrow(counts) individuals/observations. This can be a matrix, or a vector of response factors, but for most text-analysis applications

covars

A matrix of ncol(covars) covariate values for each of the nrow(counts) observations. This does not include the intercept, which is ALWAYS added in the design matrix.

normalize

Whether or not to normalize the covariate matrix to have mean zero and variance one.
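
As a rough illustration of what this normalization amounts to, the base-R sketch below centers and scales each column of a small made-up covariate matrix. It assumes column-wise scaling by the sample standard deviation; the package's exact convention is not stated here, and the names `covars_demo`, `mu`, and `sdev` are hypothetical.

```r
# Hypothetical 5 x 2 covariate matrix, for illustration only
covars_demo <- cbind(a = c(1, 2, 3, 4, 5),
                     b = c(10, 20, 20, 40, 60))

mu   <- colMeans(covars_demo)       # original column means
sdev <- apply(covars_demo, 2, sd)   # original column standard deviations

# Center and scale each column to mean zero and variance one
normalized <- scale(covars_demo, center = mu, scale = sdev)
```

Keeping `mu` and `sdev` around mirrors the role of the covarMean and covarSD entries described under Value: they allow new covariates to be put on the same scale later.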

penalty

Either a single fixed value, or a vector of length 2 giving the shape and rate parameters of the gamma hyperprior. Here, the penalty ($\lambda>0$) is a scale parameter for the Laplace prior on each non-intercept regression coefficient, parametrized

start

An initial guess for the full ncol(counts) by ncol(covars)+1 matrix of regression coefficients. Under the default start=NULL, the intercept is a logit transform of mean phrase frequencies and coef

tol

Optimization convergence tolerance for the improvement on the un-normalized negative log posterior over a single full parameter sweep.

tmax

The maximum number of optimization iterations.

delta

An initial step size for the least upper bound approximation to parameter information; implies a starting trust region of 2*delta.

dmin

Minimum trust region delta.

bins

For faster inference on large data sets (or just to collapse observations across levels for factor covariates), you can specify the number of bins for step-function approximations to the columns of covars. Counts a
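
The idea of a step-function approximation can be sketched in base R as follows. This is only an illustration under stated assumptions: it uses equal-width bins from `cut()`, represents each bin by its mean covariate value, and sums the counts of observations that share a bin. The binning rule actually used by mnlm may differ, and all object names here are hypothetical.

```r
set.seed(1)
v_demo      <- runif(20)                                  # one covariate column
counts_demo <- matrix(rpois(20 * 3, lambda = 2), ncol = 3) # 20 obs, 3 categories

nbins <- 4
# Assign each observation to one of nbins equal-width intervals
bin <- cut(v_demo, breaks = nbins, labels = FALSE)

# Step-function approximation: one representative value (the mean) per bin
v_binned <- tapply(v_demo, bin, mean)

# Collapse counts by summing the rows that fall in the same bin
counts_binned <- rowsum(counts_demo, group = bin)
```

The collapsed data has at most `nbins` rows instead of 20, which is where the speed-up on large data sets would come from.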

verb

Control for print-statement output. TRUE prints some initial info and updates every iteration.

Value

An mnlm object list with entries:

intercept

The intercept estimates for each phrase ($\alpha$).

loadings

The loading estimates for each phrase ($\phi$).

counts

simple_triplet_matrix form of the counts input matrix.

X

If bins>0, the binned counts matrix used for analysis.

covars

The input covariates, possibly normalized.

V

If bins>0, the binned (and possibly normalized) covariate matrix used for analysis.

normalized

An indicator for whether the covariates were normalized.

binned

An indicator for whether the observations were binned.

covarMean

If normalize=TRUE, the original covariate means. Otherwise empty.

covarSD

If normalize=TRUE, the original covariate standard deviations. Otherwise empty.

L

The unnormalized negative log posterior at each iteration.

residuals

Standardized Pearson residuals, for nonzero count entries only. In simple triplet matrix format, with empty entries for zero count observations.

fitted

Fitted count expectations. With binary response, this is a vector of fitted probabilities. For binomial or multinomial response, it is a simple triplet matrix with empty entries for zero count observations.

Details

Finds the posterior mode for multinomial logistic regression parameters using cyclic coordinate descent. This is designed for inverse regression analysis of sentiment in text, where the dimension of the multinomial response is quite large, but it should be generally useful for any large-scale multinomial logistic regression. We allow for joint estimation of regression coefficients and Laplace regularization penalties. For response dimension greater than two, the regression coefficients are identified by augmenting each response vector with a null count of 1/1000 of each observation's total and assuming zero coefficients for this category. With binomial response, the first category is assumed null. Full details are available in Taddy (2011).
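
The null-category identification described above can be illustrated with a small base-R sketch. It augments a made-up count matrix with a null column equal to 1/1000 of each row total, then computes multinomial logit probabilities with the null category's intercept and loading fixed at zero. The parameter values are arbitrary illustrations, not estimates, and all object names are hypothetical.

```r
# Hypothetical counts: 3 observations, 4 response categories
counts_demo <- matrix(c(5, 0, 2, 3,
                        1, 1, 0, 8,
                        0, 4, 4, 2), nrow = 3, byrow = TRUE)

# Null category holding 1/1000 of each observation's total count
m <- rowSums(counts_demo)
augmented <- cbind(null = m / 1000, counts_demo)

# Arbitrary illustrative parameters for a single covariate
alpha <- c(0.1, -0.2, 0.0, 0.3)   # intercepts, one per category
phi   <- c(0.5, 0.0, -0.5, 0.0)   # loadings, one per category
v     <- c(-1, 0, 1)              # covariate value per observation

# Linear predictors; the first (null) column is fixed at zero
eta <- cbind(0, outer(v, phi) + matrix(alpha, nrow = 3, ncol = 4, byrow = TRUE))

# Multinomial logit probabilities: each row sums to one
prob <- exp(eta) / rowSums(exp(eta))
```

Fixing one category's coefficients at zero removes the usual invariance of softmax probabilities to adding a constant to every linear predictor, which is what makes the remaining coefficients identifiable.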

References

Taddy (2011), Inverse Regression for Analysis of Sentiment in Text. http://arxiv.org/abs/1012.2098

Examples


## See congress109 and we8there for real multinomial data examples

## Bernoulli simulation; re-run to see sampling variability
n <- 100
v <- rnorm(n)
p <- 1/(1+exp(-2*v))  # true success probability: inverse logit of 2*v
y <- rbinom(n, size=1, prob=p)

## fit the logistic model
summary( fit <- mnlm(y, v) )
par(mfrow=c(1,2))
plot(fit)

## use predict to see fitted probabilities (could also just use fit$fitted)
phat <-  predict(fit, newdata=matrix(v,ncol=1))
plot(p, phat, pch=21, bg=c(2,4)[y+1], xlab="true probability", ylab="fitted probability")
