This function obtains the minimum-norm subgradient of the squared approximation error with an L1 norm penalty, an L2 norm penalty, or both.
subgradient(w, X, y, nHidden, lambda, lambda2)
w: (numeric, \(n\)) vector of weights and biases.
X: (numeric, \(n \times p\)) incidence matrix.
y: (numeric, \(n\)) the response data-vector.
nHidden: (positive integer, \(1 \times h\)) matrix; \(h\) indicates the number of hidden layers and nHidden[1, h] indicates the number of neurons in the \(h\)-th hidden layer.
lambda: (numeric, \(n\)) Lagrange multiplier for the L1 norm penalty on the parameters.
lambda2: (numeric, \(n\)) Lagrange multiplier for the L2 norm penalty on the parameters.
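For reference, these penalties are assumed to enter a penalized squared-error objective of the standard form sketched below; the exact scaling used internally (e.g. a factor of one half, or a mean rather than a sum) may differ:

\[ E(w) = \sum_{i=1}^{n} \left( y_i - \hat{y}_i(w) \right)^2 + \lambda \lVert w \rVert_1 + \lambda_2 \lVert w \rVert_2^2, \]

where \(\hat{y}_i(w)\) denotes the network prediction for the \(i\)-th observation.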
A numeric vector with the subgradient values, one entry per element of w.
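A minimal illustrative call is sketched below with toy data; the parameter count nPar assumes a dense layout of input-to-hidden weights and biases followed by hidden-to-output weights and a bias, which should be checked against the package's own initialization code:

set.seed(1)
n <- 20; p <- 3
X <- matrix(rnorm(n * p), n, p)   # incidence matrix
y <- rnorm(n)                     # response data-vector
nHidden <- matrix(4, 1, 1)        # one hidden layer with 4 neurons
nPar <- 4 * (1 + p) + (1 + 4)     # assumed layout: weights plus biases
w <- rnorm(nPar, sd = 0.1)        # weights and biases
g <- subgradient(w, X, y, nHidden, lambda = 0.1, lambda2 = 0.1)
length(g)                         # one entry per element of w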
The method chooses a minimum-norm subgradient as a steepest-descent direction and takes a step resembling a Newton iteration in this direction, using a Hessian approximation.
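As a concrete illustration of the minimum-norm choice (a sketch only; minNormSubgrad is a hypothetical helper, not part of the package): for the L1 term the subdifferential at a zero weight is the interval grad_smooth[i] + lambda * [-1, 1], and its minimum-norm element is the soft-thresholded smooth gradient.

## grad_smooth: gradient of the squared error plus the L2 term.
minNormSubgrad <- function(w, grad_smooth, lambda) {
  ## Where w[i] != 0 the L1 term is differentiable: add lambda * sign(w).
  g <- grad_smooth + lambda * sign(w)
  ## Where w[i] == 0, pick the interval element closest to zero
  ## (soft-thresholding); it vanishes when |grad_smooth[i]| <= lambda.
  at_zero <- (w == 0)
  g[at_zero] <- sign(grad_smooth[at_zero]) *
    pmax(abs(grad_smooth[at_zero]) - lambda, 0)
  g
}

A zero result then certifies that w is a stationary point of the penalized objective, which is what makes this choice suitable as a steepest-descent direction.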