mlogit.optim(logLik, start, method = c("bfgs", "nr", "bhhh"), iterlim = 2000,
tol = 1E-06, ftol = 1e-08, steptol = 1e-10,
print.level = 0, constPar = NULL, ...)
The method argument can be 'nr' for Newton-Raphson, 'bhhh' for Berndt-Hall-Hall-Hausman or 'bfgs' for Broyden-Fletcher-Goldfarb-Shanno. If 'print.level=0', no information about the optimization process is provided; if 'print.level=1', the value of the likelihood, the step and the stopping criterion are printed at each iteration.
The returned object contains '.gradi', a matrix that contains the contribution of each individual to the gradient; 'gradient', the gradient; if method='nr', 'hessian', the hessian; 'nb.iter', the number of iterations; 'eps', the value of the stopping criterion; 'method', the optimization method used; and 'message', information about the convergence of the optimization.
At each iteration, the vector of parameters is updated by an amount proportional to H^-1 g, where g is the gradient of the likelihood and H an estimate of the hessian. If method='nr', H is the hessian (i.e. the matrix of second derivatives of the likelihood function); if method='bhhh', H is the outer product of the individual contributions to the gradient; if method='bfgs', H^-1 is updated at each iteration using a formula based on the variations of the vector of parameters and of the gradient. In this last case, the initial value of the matrix is the inverse of the outer product of the gradient (i.e. the bhhh estimator of the hessian).
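As a hedged sketch of this bfgs update (the names below are illustrative, not mlogit internals), where s is the variation of the vector of parameters and y the variation of the gradient between two iterations:

bfgs_update <- function(Hm1, s, y) {
  ## one common form of the bfgs update of H^-1:
  ## H^-1 <- (I - rho s y') H^-1 (I - rho y s') + rho s s', rho = 1 / (y' s)
  k <- length(s)
  rho <- 1 / as.numeric(crossprod(y, s))
  V <- diag(k) - rho * s %*% t(y)
  V %*% Hm1 %*% t(V) + rho * s %*% t(s)
}

## initial value: the inverse of the outer product of the individual
## contributions to the gradient (the bhhh estimator of the hessian);
## 'gradi' is a hypothetical n x k matrix of such contributions
set.seed(1)
gradi <- matrix(rnorm(200), 100, 2)
Hm1 <- solve(crossprod(gradi))
Hm1 <- bfgs_update(Hm1, s = c(0.10, -0.05), y = c(-0.20, 0.15))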
The initial step is 1 and, if the new value of the function is lower than the previous one, the step is divided by two until a higher value is obtained.
The routine stops when the gradient is sufficiently close to 0: the criterion is g' H^-1 g, which is compared to the tol argument. It may also stop when the number of iterations reaches iterlim.
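As a self-contained sketch of this iteration scheme (a toy concave function stands in for the likelihood; none of the names below come from mlogit's internal code):

f <- function(b) - sum((b - c(1, 2))^2)   # toy concave objective, maximum at (1, 2)
gr <- function(b) - 2 * (b - c(1, 2))     # its gradient
he <- function(b) diag(-2, 2)             # its hessian

b <- c(0, 0); tol <- 1e-6; iterlim <- 2000; nb.iter <- 0
repeat {
  g <- gr(b)
  H <- - he(b)                            # positive definite since f is concave
  eps <- as.numeric(t(g) %*% solve(H) %*% g)
  if (eps < tol || nb.iter == iterlim) break   # stopping criteria
  direction <- solve(H, g)                # 'nr' direction: H^-1 g
  step <- 1                               # initial step of 1, ...
  while (f(b + step * direction) < f(b))  # ... halved until the function increases
    step <- step / 2
  b <- b + step * direction
  nb.iter <- nb.iter + 1
}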
The function f has an initial.value argument, which is the initial value of the likelihood. The function is first evaluated with a step equal to one. If the value obtained is lower than the initial value, the step is divided by two until the likelihood increases. The gradient is then computed, and the function returns the gradient and the step as attributes. This method is more efficient than the other optimization functions available in R:
For the optim and maxLik functions, the function and the gradient must be provided as two separate functions. But, for multinomial logit models, both depend on the probabilities, which are the most time-consuming elements of the model to compute.
For the nlm function, the function returns the gradient as an attribute. The gradient is therefore computed at each iteration, even when the function is evaluated with a step that is unable to increase the value of the likelihood.
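As a hedged illustration of this interface, here is a self-contained toy binary logit likelihood (the name lnl_binary and the simulated data are made up for the example) that computes the probabilities a single time and returns the gradient and the individual contributions as attributes:

lnl_binary <- function(beta, X, y) {
  p <- plogis(drop(X %*% beta))      # the probabilities: the costly part,
                                     # computed only once
  value <- sum(y * log(p) + (1 - y) * log(1 - p))
  gradi <- (y - p) * X               # contribution of each individual to the gradient
  attr(value, "gradi") <- gradi
  attr(value, "gradient") <- colSums(gradi)
  value
}

set.seed(2)
X <- cbind(1, rnorm(50))
y <- rbinom(50, 1, 0.5)
ll <- lnl_binary(c(0, 0), X, y)
attr(ll, "gradient")                 # gradient at beta = (0, 0)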
Previous versions of mlogit depended on the 'maxLik' package. We kept the same interface, namely the start, method, iterlim, tol, print.level and constPar arguments.
The default method is 'bfgs', which is known to perform well even if the likelihood function is not well behaved, and the default value for print.level is 1, which means moderate printing. A special default behavior applies if a simple multinomial logit model is estimated: for this model, the likelihood function is concave, the analytical hessian is simple to write and the optimization is straightforward. Therefore, in this case, the default method is 'nr' and print.level=0.
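For instance (a sketch using the Fishing data set shipped with mlogit; passing method and print.level through mlogit's ... argument is assumed from the shared interface described above):

library("mlogit")
data("Fishing", package = "mlogit")
Fish <- mlogit.data(Fishing, shape = "wide", varying = 2:9, choice = "mode")
## a simple multinomial logit: 'nr' and print.level=0 are used by default
ml <- mlogit(mode ~ price + catch, data = Fish)
## overriding the defaults
ml2 <- mlogit(mode ~ price + catch, data = Fish, method = "bfgs", print.level = 1)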