mxComputeGradientDescent: Optimize parameters using a gradient descent optimizer

Description

This optimizer does not require analytic derivatives of the fit function. The open-source version of OpenMx offers two optimizers: SLSQP (from the NLopt collection) and CSOLNP. The proprietary version of OpenMx offers three: SLSQP, CSOLNP, and NPSOL.

Usage

mxComputeGradientDescent(freeSet = NA_character_, ..., engine = NULL,
  fitfunction = "fitfunction", verbose = 0L, tolerance = NA_real_,
  useGradient = NULL, warmStart = NULL, nudgeZeroStarts = TRUE,
  maxMajorIter = NULL, gradientAlgo = mxOption(NULL, "Gradient algorithm"),
  gradientIterations = mxOption(NULL, "Gradient iterations"),
  gradientStepSize = 1e-05)

Arguments

freeSet

names of matrices containing free variables

...

Not used. Forces remaining arguments to be specified by name.

engine

the optimizer engine to use: one of 'NPSOL', 'SLSQP', or 'CSOLNP'

fitfunction

name of the fitfunction (defaults to 'fitfunction')

verbose

level of debugging output

tolerance

how close to the optimum is close enough (also known as the optimality tolerance)

useGradient

whether to use the analytic gradient (if available)

warmStart

a Cholesky factored Hessian to use as the NPSOL Hessian starting value (preconditioner)

nudgeZeroStarts

whether to nudge any zero starting values prior to optimization (default TRUE)

maxMajorIter

maximum number of major iterations

gradientAlgo

one of c('forward','central')

gradientIterations

number of Richardson iterations to use for the gradient (default 2)

gradientStepSize

the step size for the gradient (default 1e-5)

Details

One of the most important options for SLSQP is gradientAlgo. By default, the forward method is used. The forward method requires gradientIterations function evaluations per parameter per gradient. It often works well enough, but it can produce imprecise gradient estimates that prevent SLSQP from fully optimizing a given model. If a status code red is reported, you are encouraged to try the central method. The central method requires 2 times gradientIterations function evaluations per parameter per gradient, but it can be much more accurate.
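The trade-off between the two methods can be illustrated outside OpenMx. The following is a minimal base R sketch, not the OpenMx implementation: it uses a single finite-difference step rather than Richardson iterations, and the names fit, fd_gradient, and evals_per_gradient are hypothetical helpers introduced only for this illustration.

```r
# Illustrative finite-difference gradients in base R (not OpenMx internals).
# `fit`, `fd_gradient`, and `evals_per_gradient` are hypothetical names.

# Toy fit function whose exact gradient is exp(x)
fit <- function(x) sum(exp(x))

# One-step forward or central difference approximation of the gradient
fd_gradient <- function(f, x, h = 1e-5, algo = c("forward", "central")) {
  algo <- match.arg(algo)
  grad <- numeric(length(x))
  for (i in seq_along(x)) {
    step <- replace(numeric(length(x)), i, h)  # perturb parameter i only
    grad[i] <- if (algo == "forward") {
      (f(x + step) - f(x)) / h
    } else {
      (f(x + step) - f(x - step)) / (2 * h)
    }
  }
  grad
}

# Evaluation counts as stated in the Details text:
# forward = gradientIterations per parameter, central = twice that
evals_per_gradient <- function(nParams, gradientIterations,
                               algo = c("forward", "central")) {
  algo <- match.arg(algo)
  perParam <- if (algo == "forward") gradientIterations else 2 * gradientIterations
  nParams * perParam
}

x <- c(0.5, 1, 1.5)
errForward <- max(abs(fd_gradient(fit, x, algo = "forward") - exp(x)))
errCentral <- max(abs(fd_gradient(fit, x, algo = "central") - exp(x)))

errCentral < errForward                    # central is closer to the exact gradient
evals_per_gradient(5, 2, "forward")        # 5 parameters, 2 iterations
evals_per_gradient(5, 2, "central")        # twice as many evaluations
```

On this toy function the central difference should be several orders of magnitude more accurate at the same step size, at the price of doubling the number of function evaluations.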

References

Luenberger, D. G. & Ye, Y. (2008). Linear and nonlinear programming. Springer.

Examples

library(OpenMx)

data(demoOneFactor)
factorModel <- mxModel(name = "One Factor",
  # Factor loadings (fixed at 0.2) and factor variance (fixed at 1)
  mxMatrix(type = "Full", nrow = 5, ncol = 1, free = FALSE, values = 0.2, name = "A"),
  mxMatrix(type = "Symm", nrow = 1, ncol = 1, free = FALSE, values = 1, name = "L"),
  # Residual variances: the only free parameters
  mxMatrix(type = "Diag", nrow = 5, ncol = 5, free = TRUE, values = 1, name = "U"),
  # Model-implied covariance
  mxAlgebra(expression = A %*% L %*% t(A) + U, name = "R"),
  mxExpectationNormal(covariance = "R", dimnames = names(demoOneFactor)),
  mxFitFunctionML(),
  mxData(observed = cov(demoOneFactor), type = "cov", numObs = 500),
  # Custom compute plan: optimize, then Hessian, standard errors, Hessian quality
  mxComputeSequence(steps = list(
    mxComputeGradientDescent(),
    mxComputeNumericDeriv(),
    mxComputeStandardError(),
    mxComputeHessianQuality()
  )))
factorModelFit <- mxRun(factorModel)
factorModelFit$output$conditionNumber # 29.5
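
If the forward method leaves a model under-optimized, the same compute plan can request SLSQP with central-difference gradients. The sketch below uses only arguments listed under Usage. The name centralPlan is hypothetical, and whether the change helps depends on the model.

```r
library(OpenMx)

# Same steps as the compute plan in the example, but requesting SLSQP
# with central-difference gradients. `centralPlan` is a hypothetical name.
centralPlan <- mxComputeSequence(steps = list(
  mxComputeGradientDescent(engine = "SLSQP", gradientAlgo = "central"),
  mxComputeNumericDeriv(),
  mxComputeStandardError(),
  mxComputeHessianQuality()
))
```

Passing centralPlan to mxModel() in place of the mxComputeSequence(...) call in the example should run the same model with the central method.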
