These examples train a neural network on simple, randomly generated data. The training process is visualized through plots, and most parameters can be adjusted so that the effect of a change can be assessed by inspecting the plots.
example_NN(example_type = "nested", example_n = 500, example_sdnoise = 1,
  example_nframes = 30, hiddenLayers = c(5, 5), lossFunction = "log",
  dHuber = 1, rectifierLayers = NA, sigmoidLayers = NA,
  regression = FALSE, standardize = TRUE, learnRate = 0.001,
  maxEpochs = 2000, batchSize = 10, momentum = 0.3, L1 = 1e-07,
  L2 = 1e-04)
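For instance, one of the classification examples can be run with settings other than the defaults. A minimal sketch; the argument values here are illustrative, chosen from the options documented below:

# Train on the "multiclass" example with two larger hidden layers
# and a higher learning rate than the default.
example_NN(example_type = "multiclass", example_n = 1000,
  hiddenLayers = c(10, 10), lossFunction = "log",
  learnRate = 0.01, maxEpochs = 500)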
example_type: which example to use. Possible values are "surface", "polynomial", "nested", "linear", "disjoint" and "multiclass".

example_n: number of observations to generate.

example_sdnoise: standard deviation of the random normal noise added to the data.

example_nframes: number of frames to be plotted.

hiddenLayers: vector specifying the number of nodes in each hidden layer. Set to NA for a network without any hidden layers.

lossFunction: which loss function should be used. Options are "log", "quadratic", "absolute", "huber" and "pseudo-huber".

dHuber: used only with the "huber" and "pseudo-huber" loss functions. This parameter controls the cut-off point between quadratic and absolute loss; see the sketch after this argument list.

rectifierLayers: vector or integer specifying which layers should have rectifier activation in their nodes.

sigmoidLayers: vector or integer specifying which layers should have sigmoid activation in their nodes.

regression: logical indicating regression or classification.

standardize: logical indicating whether X and y should be standardized before training the network. Recommended to leave at TRUE for faster convergence.

learnRate: the size of the steps made in gradient descent. If set too large, optimization can become unstable; if set too small, convergence will be slow.

maxEpochs: the maximum number of epochs (one epoch is one pass through the training data).

batchSize: the number of observations to use in each batch. Batch learning is computationally faster than stochastic gradient descent. However, large batches might not result in optimal learning; see LeCun for details.

momentum: numeric value specifying how much momentum should be used. Set to zero for no momentum, otherwise a value between zero and one; see the update sketch at the end of this section.

L1: L1 regularization. Non-negative number. Set to zero for no regularization.

L2: L2 regularization. Non-negative number. Set to zero for no regularization.
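To make the role of dHuber concrete, here are the Huber and pseudo-Huber losses written as plain R functions. These are the standard textbook definitions, not necessarily the package's internal code:

# Huber loss: quadratic for |a| <= d, absolute (linear) beyond;
# d is the cut-off point that dHuber controls.
huber <- function(a, d = 1) {
  ifelse(abs(a) <= d, 0.5 * a^2, d * (abs(a) - 0.5 * d))
}

# Pseudo-Huber loss: a smooth approximation of the Huber loss.
pseudo_huber <- function(a, d = 1) {
  d^2 * (sqrt(1 + (a / d)^2) - 1)
}

Both reduce the influence of large residuals relative to quadratic loss, which is why they are offered alongside "quadratic" and "absolute".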
One regression example and three classification examples are included. More examples will be added in future versions of ANN.
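The optimization arguments (learnRate, momentum, L1 and L2) can be read as the pieces of a gradient-descent update with momentum and regularization. A generic sketch of such an update, as a textbook formulation rather than the package's exact implementation:

# One weight update: the L1/L2 penalties enter through their (sub)gradients,
# momentum accumulates a velocity, and learnRate scales the step size.
update_step <- function(w, grad, velocity, learnRate, momentum, L1, L2) {
  grad_total <- grad + L1 * sign(w) + L2 * w  # loss gradient plus penalty terms
  velocity <- momentum * velocity - learnRate * grad_total
  list(w = w + velocity, velocity = velocity)
}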