Root solver by Stochastic Approximation
optim_sa(
f,
init = 0,
function.args = list(),
...,
method = c("standard", "discrete", "interpolate"),
projection = identity,
control = list(niter = 100, alpha = 0.25, stepmult = 1),
verbose = TRUE,
burn.in = 0.75
)Stochastic function
Initial value to evaluate f in
Additional arguments passed to f
Additional arguments passed to f
Method ('standard': standard Robbins-Monro, 'discrete' or 'interpolate' for integer stochastic optimization. See details section)
Optional projection function that can be used to constrain the parameter value (e.g., function(x) max(x, tau)). Applied at the end of each iteration of the optimization.
Control arguments (niter number of iterations, alpha and
stepmult control the step-length of the Polyak-Ruppert algorithm as
described in the details section).
if TRUE additional messages are printed throughout the optimization
Burn-in (fraction of initial values to disregard) when
applying Polyak–Ruppert averaging (alpha<1). Alternatively, burn.in
can be given as an integer defining the absolut number of iterations to
disregard.
The aim is to find the root of the function \( M(\theta) = E f(\theta)\), where we are only able to observe the stochastic function \(f(\theta)\). We will assume that $M$ is non-decreasing with a unique root \(\theta^\ast\). The Robbins-Monro algorithm works through the following update rule from an initial start value \(\theta_0\): $$\theta_{n+1} = \theta_{n} - a_n f(\theta_n)$$ where \(\sum_{n=0}^\infty a_n = \infty\) and \(\sum_{n=0}^\infty a_n^2 < \infty\).
By averaging the iterates $$\overline{\theta}_n =
\frac{1}{n}\sum_{i=1}^{n-1}\theta_i$$ we can get improved stability of the
algorithm that is less sensitive to the choice of the step-length \(a_n\).
This is known as the Polyak-Ruppert algorithm and to ensure convergence
longer step sizes must be made which can be guaranteed by using \(a_n =
Kn^{-\alpha}\) with \(0<\alpha<1\). The parameters \(K\) and \(\alpha\)
are controlled by the stepmult and alpha parameters of the
control argument.
For discrete problems where \(\theta\) must be an integer, we follow (Dupac & Herkenrath, 1984). Let \(\lfloor x\rfloor\) denote the integer part of \(x\), and define either (method="discrete"): $$g(\theta) = I(U<\theta-\lfloor\theta\rfloor)f(\lfloor\theta\rfloor) + I(U\geq\theta-\lfloor\theta\rfloor)f(\lfloor\theta\rfloor+1)$$ where \(U\sim Uniform([0,1])\), or (method="interpolate"): $$g(\theta) = (1-\theta+\lfloor\theta\rfloor)f(\lfloor\theta\rfloor) + (\theta-\lfloor\theta\rfloor)f(\lfloor\theta\rfloor+1).$$ The stochastic approximation method is then applied directly on \(g\).
Dupač, V., & Herkenrath, U. (1984). On integer stochastic approximation. Aplikace matematiky, 29(5), 372-383.