mlogit.optim(logLik, start, method = c("bfgs", "nr", "bhhh"), iterlim = 2000,
tol = 1E-06, ftol = 1e-08, steptol = 1e-10,
print.level = 0, constPar = NULL, ...)
The method argument can be 'nr' for Newton-Raphson, 'bhhh' for Berndt-Hall-Hall-Hausman or 'bfgs' for Broyden-Fletcher-Goldfarb-Shanno. If 'print.level=0', no information about the optimization process is provided; if 'print.level=1', the value of the likelihood, the step and the stopping criterion are printed at each iteration.
The returned object contains '.gradi', a matrix that contains the contribution of each individual to the gradient; 'gradient', the gradient; if method='nr', 'hessian', the hessian; 'nb.iter', the number of iterations; 'eps', the value of the stopping criterion; 'method', the optimization method used; and 'message', information about the convergence of the optimization.
At each iteration, the vector of parameters is updated by an amount proportional to H^-1 g, where g is the gradient of the likelihood and H an estimate of the hessian. If method='nr', H is the hessian (i.e. the matrix of second derivatives of the likelihood function); if method='bhhh', H is the outer product of the individual contributions to the gradient; if method='bfgs', H^-1 is updated at each iteration using a formula based on the variations of the vector of parameters and of the gradient. In this last case, the initial value of the matrix is the inverse of the outer product of the gradient (i.e. the bhhh estimator of the hessian).
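As a hedged sketch of this bfgs update (the names below are illustrative, not mlogit internals), where s is the variation of the vector of parameters and y the variation of the gradient between two iterations:

bfgs_update <- function(Hm1, s, y) {
  ## one common form of the bfgs update of H^-1:
  ## H^-1 <- (I - rho s y') H^-1 (I - rho y s') + rho s s', rho = 1 / (y' s)
  k <- length(s)
  rho <- 1 / as.numeric(crossprod(y, s))
  V <- diag(k) - rho * s %*% t(y)
  V %*% Hm1 %*% t(V) + rho * s %*% t(s)
}

## initial value: the inverse of the outer product of the individual
## contributions to the gradient (the bhhh estimator of the hessian);
## 'gradi' is a hypothetical n x k matrix of such contributions
set.seed(1)
gradi <- matrix(rnorm(200), 100, 2)
Hm1 <- solve(crossprod(gradi))
Hm1 <- bfgs_update(Hm1, s = c(0.10, -0.05), y = c(-0.20, 0.15))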
The initial step is 1 and, if the new value of the function is lower than the previous one, the step is divided by two until a higher value is obtained.
The routine stops when the gradient is sufficiently close to 0: the criterion is g' H^-1 g, which is compared to the tol argument. It may also stop when the number of iterations reaches iterlim.
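As a self-contained sketch of this iteration scheme (a toy concave function stands in for the likelihood; none of the names below come from mlogit's internal code):

f <- function(b) - sum((b - c(1, 2))^2)   # toy concave objective, maximum at (1, 2)
gr <- function(b) - 2 * (b - c(1, 2))     # its gradient
he <- function(b) diag(-2, 2)             # its hessian

b <- c(0, 0); tol <- 1e-6; iterlim <- 2000; nb.iter <- 0
repeat {
  g <- gr(b)
  H <- - he(b)                            # positive definite since f is concave
  eps <- as.numeric(t(g) %*% solve(H) %*% g)
  if (eps < tol || nb.iter == iterlim) break   # stopping criteria
  direction <- solve(H, g)                # 'nr' direction: H^-1 g
  step <- 1                               # initial step of 1, ...
  while (f(b + step * direction) < f(b))  # ... halved until the function increases
    step <- step / 2
  b <- b + step * direction
  nb.iter <- nb.iter + 1
}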
The function f has an initial.value argument, which is the initial value of the likelihood. The function is first evaluated with a step equal to one. If the value obtained is lower than the initial value, the step is divided by two until the likelihood increases. The gradient is then computed, and the function returns the gradient and the step as attributes. This method is more efficient than the other optimization functions available in R:
For the optim and maxLik functions, the function and the gradient must be provided as two separate functions. But, for multinomial logit models, both depend on the probabilities, which are the most time-consuming elements of the model to compute.
For the nlm function, the function returns the gradient as an attribute. The gradient is therefore computed at each iteration, even when the function is evaluated with a step that is unable to increase the value of the likelihood.
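As a hedged illustration of this interface, here is a self-contained toy binary logit likelihood (the name lnl_binary and the simulated data are made up for the example) that computes the probabilities a single time and returns the gradient and the individual contributions as attributes:

lnl_binary <- function(beta, X, y) {
  p <- plogis(drop(X %*% beta))      # the probabilities: the costly part,
                                     # computed only once
  value <- sum(y * log(p) + (1 - y) * log(1 - p))
  gradi <- (y - p) * X               # contribution of each individual to the gradient
  attr(value, "gradi") <- gradi
  attr(value, "gradient") <- colSums(gradi)
  value
}

set.seed(2)
X <- cbind(1, rnorm(50))
y <- rbinom(50, 1, 0.5)
ll <- lnl_binary(c(0, 0), X, y)
attr(ll, "gradient")                 # gradient at beta = (0, 0)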
Previous versions of mlogit depended on the 'maxLik' package. We kept the same interface, namely the start, method, iterlim, tol, print.level and constPar arguments.
The default method is 'bfgs', which is known to perform well even if the likelihood function is not well behaved, and the default value for print.level is 1, which means moderate printing. A special default behavior applies if a simple multinomial logit model is estimated: for this model, the likelihood function is concave, the analytical hessian is simple to write and the optimization is straightforward. Therefore, in this case, the default method is 'nr' and print.level=0.
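For instance (a sketch using the Fishing data set shipped with mlogit; passing method and print.level through mlogit's ... argument is assumed from the shared interface described above):

library("mlogit")
data("Fishing", package = "mlogit")
Fish <- mlogit.data(Fishing, shape = "wide", varying = 2:9, choice = "mode")
## a simple multinomial logit: 'nr' and print.level=0 are used by default
ml <- mlogit(mode ~ price + catch, data = Fish)
## overriding the defaults
ml2 <- mlogit(mode ~ price + catch, data = Fish, method = "bfgs", print.level = 1)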