Select one of the following optimizers: "SGD", "RMSPROP", "ADAGRAD", "ADADELTA", "ADAM", "ADAMAX", "NADAM".
selectKerasOptimizer(
optimizer,
learning_rate = 0.01,
momentum = 0,
decay = 0,
nesterov = FALSE,
clipnorm = NULL,
clipvalue = NULL,
rho = 0.9,
epsilon = NULL,
beta_1 = 0.9,
beta_2 = 0.999,
amsgrad = FALSE,
...
)
Optimizer for use with `compile.keras.engine.training.Model`.
`optimizer`: integer specifying the algorithm. Can be one of the following: 1=SGD, 2=RMSPROP, 3=ADAGRAD, 4=ADADELTA, 5=ADAM, 6=ADAMAX, or 7=NADAM.
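A minimal usage sketch, assuming `selectKerasOptimizer()` is available and the `keras` R package is installed; the optimizer is chosen by its integer code and the returned object is passed to `compile()`, as described above:

``` r
library(keras)

# Select ADAGRAD via its integer code (3) with a custom learning rate.
opt <- selectKerasOptimizer(optimizer = 3, learning_rate = 0.01)

# The returned optimizer can be used like any Keras optimizer in compile().
model <- keras_model_sequential() %>%
  layer_dense(units = 16, activation = "relu", input_shape = c(10)) %>%
  layer_dense(units = 1)
model %>% compile(optimizer = opt, loss = "mse")
```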
### SGD:
`learning_rate`: float >= 0. Learning rate.
`momentum`: float >= 0. Parameter that accelerates SGD in the relevant direction and dampens oscillations.
`decay`: float >= 0. Learning rate decay over each update.
`nesterov`: boolean. Whether to apply Nesterov momentum.
`clipnorm`: Gradients will be clipped when their L2 norm exceeds this value.
`clipvalue`: Gradients will be clipped when their absolute value exceeds this value.
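For illustration only, a hedged sketch of an SGD configuration built from the arguments above (the values are arbitrary examples, not recommendations):

``` r
# SGD (optimizer = 1) with Nesterov momentum and gradient clipping by L2 norm.
sgd_opt <- selectKerasOptimizer(
  optimizer = 1,
  learning_rate = 0.01,
  momentum = 0.9,
  decay = 1e-6,
  nesterov = TRUE,
  clipnorm = 1.0
)
```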
### RMSPROP:
`rho`: float >= 0. Decay factor.
`epsilon`: float >= 0. Fuzz factor. If `NULL`, defaults to `k_epsilon()`.
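A corresponding RMSPROP sketch with arbitrary example values; leaving `epsilon = NULL` falls back to `k_epsilon()` as documented above:

``` r
# RMSPROP (optimizer = 2); rho controls the moving-average decay.
rms_opt <- selectKerasOptimizer(
  optimizer = 2,
  learning_rate = 0.001,
  rho = 0.9,
  epsilon = NULL
)
```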
### ADAM:
`beta_1`: float, 0 < beta < 1. The exponential decay rate for the 1st moment estimates. Generally close to 1.
`beta_2`: float, 0 < beta < 1. The exponential decay rate for the 2nd moment estimates. Generally close to 1.
`amsgrad`: boolean. Whether to apply the AMSGrad variant of this algorithm from the paper "On the Convergence of Adam and Beyond".
`...`: Unused, present only for backwards compatibility.
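Finally, a hedged ADAM sketch using the `beta_1`, `beta_2`, and `amsgrad` arguments described above (example values only):

``` r
# ADAM (optimizer = 5) with the AMSGrad variant enabled.
adam_opt <- selectKerasOptimizer(
  optimizer = 5,
  learning_rate = 0.001,
  beta_1 = 0.9,
  beta_2 = 0.999,
  amsgrad = TRUE
)
```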