MNLAuxMix: Bayesian Multinomial Logit Regression Using Auxiliary Mixture Sampling

Description

This function provides Bayesian multinomial logit regression using auxiliary mixture sampling. See Fruehwirth-Schnatter and Fruehwirth (2010). That is an MCMC sampler that is also used for the mixtures-of-experts extension of Dirichlet Multinomial (dmClustExtended) and Markov chain clustering (mcClustExtended). It requires four mandatory arguments: Data, Prior, Initial and Mcmc; each representing a list of (mandatory) arguments: Data contains data information, Prior contains prior information, Initial contains information about starting conditions (initial values) and Mcmc contains the setup for the MCMC sampler.

Usage

MNLAuxMix( 
    Data = list( storeDir = "try01", 
                 X = stop("X (matrix of covariates) must be specified")), 
    Prior = list( H = 4, betaPrior = "informative", 
                  betaPriorMean = 0, betaPriorVar = 1), 
    Initial = list( S.i.start = rep(1:H, N), Beta.start = NULL), 
    Mcmc = list( M = 50, M0 = 20, mOut = 5, mSave = 10, seed = 12345))

Arguments

Value

A list containing (/the output file contains):workspaceFileA character indicating the name of and the path (based on the currend working directory) to the output file, wherein all the results are saved. The name of the output file starts with mnLogit_newAux_ respectively followed by the number of groups H, the number of iterations M and the particular point in time when the function was called, with format: yyyymmdd_hhmmss. E.g. mnLogit_newAux_H4_M10000_20110218_045254.RData.DataThe argument Data.PriorThe argument Prior.InitialThe argument Initial.McmcThe argument Mcmc.Beta.mA 3-dimensional array of dimension $\code{ncol(X)} \times H \times M$ containing the draws for the regression coefficients $\beta_h$ in each $m$-th iteration step.bk0The prior parameters for the mean vectors of the normal (prior) distributions of the regression coefficients.Bk0invThe prior parameters for the inverse variance-covariance matrices of the normal (prior) distributions of the regression coefficients.fileNameA character value indicating the name of the output file (see also workspaceFile).NAn integer indicating $N$, the number of individuals/units/objects.totalTimeA numeric value indicating the total time (in secs) used for the function call.bkNThe posterior parameters (in the last iteration step) for the mean vectors of the normal (posterior) distributions from which the regression coefficients were drawn.BkNThe posterior parameters (in the last iteration step) for the variance-covariance matrices of the normal (posterior) distributions from which the regression coefficients were drawn.logLikeA vector containing the values of the log-likelihood calculated in each iteration step.

Reporting Progress (Log Protocol)

The log protocol keeps record of the progress of the estimation procedure and is shown on the screen. At first the name of the workspace file is documented. Then for each mOut-th iteration step (at least for $m=1, \ldots, 5, 10, 20, 50, 100, 200, 500$) information about the elapsed time and the expected time to the end is reported. Finally the total time is shown. For example: workspaceFile: MNLAuxMix-Example-Outfiles\mnLogit_newAux_H2_M100_20111129_083023.RData (within current working directory!) m = 1 ; duration of iter proc so far: 0.25 sec. m = 2 ; duration of iter proc so far: 0.33 sec., exp time to end: 0.54 min. m = 3 ; duration of iter proc so far: 0.44 sec., exp time to end: 0.36 min. m = 4 ; duration of iter proc so far: 0.52 sec., exp time to end: 0.28 min. m = 5 ; duration of iter proc so far: 0.6 sec., exp time to end: 0.24 min. m = 10 ; duration of iter proc so far: 1.04 sec., exp time to end: 0.18 min. m = 20 ; duration of iter proc so far: 1.93 sec., exp time to end: 0.14 min. m = 30 ; duration of iter proc so far: 2.8 sec., exp time to end: 0.11 min. m = 40 ; duration of iter proc so far: 3.79 sec., exp time to end: 0.1 min. m = 50 ; duration of iter proc so far: 4.79 sec., exp time to end: 0.08 min. m = 60 ; duration of iter proc so far: 5.89 sec., exp time to end: 0.07 min. m = 70 ; duration of iter proc so far: 6.8 sec., exp time to end: 0.05 min. m = 80 ; duration of iter proc so far: 7.68 sec., exp time to end: 0.03 min. m = 90 ; duration of iter proc so far: 8.63 sec., exp time to end: 0.02 min. m = 100 ; duration of iter proc so far: 9.52 sec., exp time to end: 0 min. Total time: 0 hours 0 min

Warning

Note that there are no such things as default values (see Section Arguments)!

Details

Note that the values of the arguments indicated here have nothing to do with default values! For a call of these functions this lists-of-arguments structure requires a complete specification of all arguments! The following arguments which are lists have to be completely provided (note that there are no such things as default values within lists!): Data contains: [object Object],[object Object] Prior contains (see also Section Prior Data): [object Object],[object Object],[object Object] Initial contains: [object Object],[object Object] Mcmc contains: [object Object],[object Object],[object Object],[object Object],[object Object]

References

Sylvia Fruehwirth-Schnatter, Christoph Pamminger, Andrea Weber and Rudolf Winter-Ebmer, (2011), "Labor market entry and earnings dynamics: Bayesian inference using mixtures-of-experts Markov chain clustering". Journal of Applied Econometrics. DOI: 10.1002/jae.1249 http://onlinelibrary.wiley.com/doi/10.1002/jae.1249/abstract Christoph Pamminger and Sylvia Fruehwirth-Schnatter, (2010), "Model-based Clustering of Categorical Time Series". Bayesian Analysis, Vol. 5, No. 2, pp. 345-368. DOI: 10.1214/10-BA606 http://ba.stat.cmu.edu/journal/2010/vol05/issue02/pamminger.pdf Sylvia Fruehwirth-Schnatter and Rudolf Fruehwirth, (2010), "Data augmentation and MCMC for binary and multinomial logit models". In T. Kneib and G. Tutz (eds): Statistical Modelling and Regression Structures: Festschrift in Honour of Ludwig Fahrmeir. Physica Verlag, Heidelberg, pp. 111-132. DOI: 10.1007/978-3-7908-2413-1_7 http://www.springerlink.com/content/t4h810017645wh68/. See also: IFAS Research Paper Series 2010-48 (http://www.jku.at/ifas/content/e108280/e108491/e108471/e109880/ifas_rp48.pdf).

Examples

Run this code

#rm(list=ls(all=TRUE))

# ==================================================================================
if ( FALSE ) {
# ==================================================================================

# set working directory
oldDir <- getwd()
curDir <- tempdir()
setwd(curDir)

if ( !file.exists("bayesMCClust-wd") ) dir.create("bayesMCClust-wd")
setwd("bayesMCClust-wd") 
myOutfilesDir <- "MNLAuxMix-Example-Outfiles"      

data(MCCExtExampleData)

if (!is.element("MCCExtExampleData$covariates", search())) {
    attach(MCCExtExampleData$covariates)
}

# ==================================================================================

response <- MCCExtExampleData[[ sample(5:7, 1) ]] # MCCExtExampleData$MNLresponse2gr
# MCCExtExampleData$MNLresponse3gr # MCCExtExampleData$MNLresponse4gr # 

groupNr <- max(response) # 3

# ==================================================================================
# ==================================================================================

require(nnet, quietly = TRUE)
H <- groupNr
X = cbind( intercept=1, alrateBezNew, unskilled, skilled, angStart ) 

N <- dim(X)[1]
mX <- data.frame( cbind(group=as.factor( response ), X[,-1], 
                  matrix(sample(1:H,H*N,replace=TRUE),N,H)) )

colnames(mX)[6:(6+groupNr-1)] <- c(  "as.1", "as.2", "as.3", "as.4" )[1:groupNr] 

tempMNom <- multinom(group ~ alrateBezNew+ unskilled+ skilled+ angStart, 
                     data=as.data.frame(mX)) 

toStartBeta <- t(rbind(0,coef( tempMNom )))

# ==================================================================================
system.time(
  outList <- MNLAuxMix( 
    Data = list( storeDir = myOutfilesDir, 
                 # will be created if not existing (in current working directory!)
                 X = cbind( intercept=1, alrateBezNew, unskilled, skilled, angStart ) ),                         
    Prior = list( H = groupNr, # number of alternatives 1,...,H
                  betaPrior = "informative",  
                  # 'uninformative' (improper) prior pars for beta (betaPriorVar = infty)
                  betaPriorMean = 0,
                  betaPriorVar = 1), # 'informative' prior pars for beta -> N(0,1)
    Initial = list( S.i.start = response, #  vector of multinomial outcomes / choice made
                    Beta.start = toStartBeta ),  
    Mcmc = list( M = 100, 
                 M0 = 50, 
                 mOut = 10, 
                 mSave = 50, 
                 seed = sample(1:100000, 1) # 6984684
    )
  )
)

str(outList)

#outFileName <- outList$workspaceFile
#outFileName
#results <- load(outFileName)
#results

allocList <- calcAllocationsMNL(outList, thin=1, maxi=50) 
str(allocList)

myRegCoeffs <- calcRegCoeffs(outList, hBase=2, thin=1) 
#, M0=Mcmc$M0, grLabels=paste("Group", 1:Prior$H), 
# printHPD=TRUE, plotPaths=TRUE, plotACFs=TRUE
str(myRegCoeffs)

mySegPower <- calcSegmentationPower(outList, classProbs=allocList$classProbs, 
    class=allocList$class, printXtable=TRUE, calcSharp=TRUE, printSharpXtable=TRUE ) 
    # , grLabels=paste("Group", 1:Prior$H)
str(mySegPower)

myEntropy <- calcEntropy(outList, classProbs=allocList$classProbs, 
    class=allocList$class, printXtable=TRUE ) 
    # , grLabels=paste("Group", 1:Prior$H)
myEntropy

plotLikeliPaths(outList, from=10, by=1 )

myNumEffTables <- calcNumEff( outList, thin=1, printXi=TRUE, printE=TRUE, 
    printBeta=TRUE, grLabels=paste("Group", 1:outList$Prior$H) ) 
str(myNumEffTables)

setwd(oldDir)

# ==================================================================================

if ( is.element("MCCExtExampleData$covariates", search())) { 
    detach(MCCExtExampleData$covariates)
}

# ==================================================================================
} # end if
# ==================================================================================

# ==================================================================================

Run the code above in your browser using DataLab