Adagrad optimizer as described in Adaptive Subgradient Methods for Online Learning and Stochastic Optimization.
optimizer_adagrad(lr = 0.01, epsilon = 1e-08, decay = 0,
  clipnorm = NULL, clipvalue = NULL)
lr: float >= 0. Learning rate.
epsilon: float >= 0. Fuzz factor: a small constant added to the denominator for numerical stability.
decay: float >= 0. Learning rate decay over each update.
clipnorm: Gradients will be clipped when their L2 norm exceeds this value.
clipvalue: Gradients will be clipped when their absolute value exceeds this value.
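To make the roles of lr, epsilon and decay concrete, here is a minimal base-R sketch of a single Adagrad step. It is an illustration only, not the keras implementation: the function adagrad_step() and its param, grad, accum and iteration arguments are hypothetical names, and the time-based form lr / (1 + decay * iteration) is an assumption about how decay is applied. Gradient clipping is left out.

```r
# Illustrative sketch only; adagrad_step() is a hypothetical helper,
# not part of the keras package.
adagrad_step <- function(param, grad, accum, iteration,
                         lr = 0.01, epsilon = 1e-08, decay = 0) {
  # Assumed time-based decay of the base learning rate (no effect when decay = 0)
  lr_t <- lr / (1 + decay * iteration)
  # Accumulate the squared gradient of each parameter
  accum <- accum + grad^2
  # Scale each coordinate by the root of its accumulated squared gradient;
  # epsilon keeps the denominator away from zero
  param <- param - lr_t * grad / (sqrt(accum) + epsilon)
  list(param = param, accum = accum)
}

# Toy problem: minimize f(w) = sum(w^2), whose gradient is 2 * w
w <- c(1, -2)
state <- list(param = w, accum = c(0, 0))
for (i in 0:99) {
  state <- adagrad_step(state$param, 2 * state$param, state$accum,
                        iteration = i)
}
state$param  # both coordinates have moved toward 0
```

Because the accumulator only grows, the effective step size of every parameter shrinks over time, which is why Adagrad is usually run with a larger lr than the one you would pick for plain SGD.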
Other optimizers: optimizer_adadelta,
  optimizer_adamax,
  optimizer_adam,
  optimizer_nadam,
  optimizer_rmsprop,
  optimizer_sgd