Different types of optimizer functions such as SGD, Momentum, AdamG and NAG.
optimizerMomentum(V, dW, W, alpha = 0.63, lr = 1e-4, lambda = 1)
Returns the updated W and other parameters, such as V, V1 and V2, that will be used in SGD.
Momentum: V = alpha*V - lr*(dW + lambda*W); W = W + V.
NAG: V = alpha*(V - lr*(dW + lambda*W)); W = W + V - lr*(dW + lambda*W).
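For illustration only, these updates can be written out in a few lines of base R; my_momentum_step and my_nag_step below are made-up helper names, not functions from this package, and simply mirror the formulas above.

my_momentum_step <- function(V, dW, W, alpha = 0.63, lr = 1e-4, lambda = 1) {
  V <- alpha*V - lr*(dW + lambda*W)   # velocity update with the L2 penalty gradient
  W <- W + V                          # apply the velocity to the weights
  list(V = V, W = W)
}
my_nag_step <- function(V, dW, W, alpha = 0.63, lr = 1e-4, lambda = 1) {
  g <- dW + lambda*W                  # regularized gradient
  V <- alpha*(V - lr*g)               # look-ahead velocity
  W <- W + V - lr*g                   # Nesterov-style weight update
  list(V = V, W = W)
}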
dW: derivative of the cost with respect to W, which can be found by dW = bwdNN(dy, cache, model).
W: weights for the DNN model, updated by W = W + V.
alpha: momentum rate, 0 < alpha < 1; default is alpha = 0.63.
lr: learning rate; default is lr = 1e-4.
lambda: regularization rate for cost + 0.5*lambda*||W||^2; default is lambda = 1.0.
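As a quick sanity check on the penalty term (an illustration, not package code), the gradient of 0.5*lambda*sum(W^2) is lambda*W, which is exactly the term added to dW in the updates above:

W <- c(0.5, -1.2)
lambda <- 1
pen <- function(W) 0.5*lambda*sum(W^2)   # L2 penalty added to the cost
eps <- 1e-6
num_grad <- sapply(seq_along(W), function(i) {
  Wp <- W; Wp[i] <- Wp[i] + eps
  (pen(Wp) - pen(W))/eps                 # one-sided numerical derivative
})
all.equal(num_grad, lambda*W, tolerance = 1e-4)   # TRUE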
Bingshu E. Chen
For SGD with momentum, use
V = 0; obj = optimizerMomentum(V, dW, W); V = obj$V; W = obj$W
For SGD with NAG, use
V = 0; obj = optimizerNAG(V, dW, W); V = obj$V; W = obj$W
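The pattern above can also be seen end to end on a toy problem; the snippet below is a self-contained sketch that uses a stand-in update function (my_momentum_step, the toy target b, and the step count are assumptions for illustration) rather than the package optimizer, so it runs without any particular weight format:

my_momentum_step <- function(V, dW, W, alpha = 0.63, lr = 0.1, lambda = 0) {
  V <- alpha*V - lr*(dW + lambda*W); W <- W + V
  list(V = V, W = W)
}
b <- c(1, -2, 3)                 # toy target; cost is 0.5*sum((W - b)^2)
W <- rep(0, 3); V <- 0           # same initialization pattern as above
for (i in 1:200) {
  dW <- W - b                    # hand-coded gradient of the toy cost
  obj <- my_momentum_step(V, dW, W)
  V <- obj$V; W <- obj$W
}
round(W, 3)                      # close to c(1, -2, 3)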
activation,
bwdNN,
fwdNN,
dNNmodel,
dnnFit