Description
Stochastic gradient descent (SGD) optimizer.
Usage
SGD(momentum = 0.5, dampening = 0, weight_decay = 0, nesterov = TRUE)
Value
An anonymous function that returns the optimizer when called.
Arguments
- momentum
Momentum factor (strength of the momentum term). Default: 0.5.
- dampening
Dampening applied to the momentum term. Default: 0.
- weight_decay
L2 penalty on the weights. Default: 0.
- nesterov
Whether to use Nesterov momentum. Default: TRUE.
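
Details
To make the interaction of the four arguments concrete, the following base R sketch shows one commonly used form of the SGD update. It assumes the arguments carry their conventional (PyTorch-style) meaning, which this page does not confirm. The helper `sgd_step`, its `buf` state, and the learning rate `lr` are hypothetical names introduced only for illustration; they are not part of this package.

```r
# Illustrative sketch only: sgd_step() is a hypothetical helper, and `lr`
# (learning rate) is an assumed extra parameter that SGD() itself does not take.
sgd_step <- function(w, grad, buf = NULL, lr = 0.01,
                     momentum = 0.5, dampening = 0,
                     weight_decay = 0, nesterov = TRUE) {
  # weight_decay: the L2 penalty adds weight_decay * w to the gradient
  g <- grad + weight_decay * w
  if (momentum != 0) {
    if (is.null(buf)) {
      # First step: the momentum buffer starts as the current gradient
      buf <- g
    } else {
      # dampening scales down the new gradient's contribution to the buffer
      buf <- momentum * buf + (1 - dampening) * g
    }
    # nesterov: look ahead along the buffer instead of using it directly
    g <- if (nesterov) g + momentum * buf else buf
  }
  list(w = w - lr * g, buf = buf)
}

# Minimize f(w) = w^2 (gradient 2 * w), starting from w = 1
state <- list(w = 1, buf = NULL)
for (i in 1:100) {
  state <- sgd_step(state$w, grad = 2 * state$w, buf = state$buf, lr = 0.1)
}
```

With the defaults shown in Usage (momentum = 0.5, dampening = 0, nesterov = TRUE), each step in this sketch moves along the current gradient plus half of the accumulated buffer.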