trustOptim (version 0.8.3)

trust.optim: Nonlinear optimizers using trust regions

Description

Runs a nonlinear minimizer using a trust region algorithm with conjugate gradient search directions and quasi-Hessian updates.

Usage

trust.optim(x, fn, gr, hs=NULL, method=c("SR1","BFGS","Sparse"),
	    control=list(), ...)

Arguments

x
a numeric vector of starting values for the optimizer.
fn
an R function that takes x as its first argument. Returns the value of the objective function at x. Note that the optimizer will minimize fn (see function.scale.factor under the control parameters).
gr
an R function that takes x as its first argument. Returns a numeric vector that is the gradient of fn at x. Naturally, the length of the gradient must be the same as the length of x.
hs
an R function that takes x as its first argument. Returns a Hessian matrix object of class "dgCMatrix" (see the Matrix package). This function is called only if the selected method is "Sparse."
method
Valid arguments are "SR1", "BFGS", and "Sparse".
control
A list containing control parameters for the optimizer. See details.
...
Additional arguments passed to fn, gr, and hs. All arguments must be named.
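
As a minimal worked example (not from the package documentation), the following minimizes the Rosenbrock function with an analytic gradient and the SR1 update; all names here are illustrative:

library(trustOptim)

## Rosenbrock function and its analytic gradient
fn <- function(x) (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
gr <- function(x) {
    c(-2 * (1 - x[1]) - 400 * x[1] * (x[2] - x[1]^2),
      200 * (x[2] - x[1]^2))
}

res <- trust.optim(c(-1.2, 1), fn = fn, gr = gr, method = "SR1")
res$solution  ## parameter vector at the optimum
res$fval      ## objective value at the optimum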

Value

  • fval: value of the objective function
  • solution: parameter vector at the optimum
  • gradient: gradient at the optimum
  • hessian: estimate of the Hessian at the optimum (as class "symmetricMatrix"; returned only for the Sparse method)
  • iterations: number of iterations before stopping
  • status: a message describing the last state of the iterator

Stopping criteria

The algorithm will stop when one of the following conditions is met:
  • The norm of the gradient, divided by the square root of the number of parameters, is less than prec.
  • The trust region collapses to a radius smaller than machine precision.
  • The algorithm proposes zero or negative improvement in the objective function (this should never happen).
  • The number of iterations reaches the control parameter maxit.
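
A minimal sketch (not from the package documentation) of adjusting these criteria through the control list, reusing the fn and gr defined in the example above; prec and maxit are the control parameters named in this section:

res <- trust.optim(c(-1.2, 1), fn = fn, gr = gr, method = "BFGS",
                   control = list(prec = 1e-8, maxit = 500L))
res$status      ## message describing the last state of the iterator
res$iterations  ## number of iterations before stopping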

Estimating a sparse Hessian

Sometimes estimating the Hessian is easy (e.g., you have an analytic representation, or you are using some kind of algorithmic differentiation software). If you do not know the Hessian, but you do know the sparsity structure, try the sparseHessianFD package. The routines in sparseHessianFD compute the Hessian using finite differencing, but in a way that exploits the sparsity structure. In many cases, this can be faster than constructing an analytic Hessian for a large problem (e.g., when the Hessian has a block-arrow structure with a large number of blocks).

To use the sparseHessianFD package, you need to provide the row and column indices of the non-zero elements of the lower triangle of the Hessian. This structure cannot change during the course of the trust.optim routine. Also, you really should provide an analytic gradient. sparseHessianFD computes finite differences of the gradient, so if the gradient itself is finite-differenced, so much error is propagated through that the Hessians are nearly worthless close to the optimum.
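
A hedged sketch of that wiring, assuming the sparseHessianFD constructor interface (an object whose hessian method returns a "dgCMatrix") and reusing the fn and gr from the example above; the sparsity pattern here is illustrative:

library(trustOptim)
library(sparseHessianFD)

## Row and column indices of the non-zero elements of the lower
## triangle; for this tiny two-parameter problem the lower triangle
## happens to be fully dense.
rows <- c(1L, 2L, 2L)
cols <- c(1L, 1L, 2L)

x0  <- c(-1.2, 1)
obj <- sparseHessianFD(x0, fn = fn, gr = gr, rows = rows, cols = cols)

res <- trust.optim(x0, fn = fn, gr = gr,
                   hs = function(x) obj$hessian(x),
                   method = "Sparse")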

Of course, sparseHessianFD is useful only for the Sparse method. That said, one may still get decent performance using these routines even if the Hessian is dense, as long as the problem is not too large. Just treat the Hessian as if it were sparse.