Fit a deep neural network with optional pre-training and fine-tuning.
# S3 method for default
darch(x, y, layers = NULL, ..., xValid = NULL,
yValid = NULL, scale = F, normalizeWeights = F, rbm.batchSize = 1,
rbm.trainOutputLayer = T, rbm.learnRateWeights = 0.1,
rbm.learnRateBiasVisible = 0.1, rbm.learnRateBiasHidden = 0.1,
rbm.weightCost = 2e-04, rbm.initialMomentum = 0.5,
rbm.finalMomentum = 0.9, rbm.momentumSwitch = 5,
rbm.visibleUnitFunction = sigmUnitFunc,
rbm.hiddenUnitFunction = sigmUnitFuncSwitch,
rbm.updateFunction = rbmUpdate, rbm.errorFunction = mseError,
rbm.genWeightFunction = generateWeights, rbm.numCD = 1,
rbm.numEpochs = 0, darch = NULL, darch.batchSize = 1,
darch.bootstrap = T, darch.genWeightFunc = generateWeights,
darch.logLevel = INFO, darch.fineTuneFunction = backpropagation,
darch.initialMomentum = 0.5, darch.finalMomentum = 0.9,
darch.momentumSwitch = 5, darch.learnRateWeights = 0.1,
darch.learnRateBiases = 0.1, darch.errorFunction = mseError,
darch.dropoutInput = 0, darch.dropoutHidden = 0,
darch.dropoutOneMaskPerEpoch = F,
darch.layerFunctionDefault = sigmoidUnitDerivative,
darch.layerFunctions = list(),
darch.layerFunction.maxout.poolSize = getOption("darch.unitFunction.maxout.poolSize",
NULL), darch.isBin = F, darch.isClass = T, darch.stopErr = -Inf,
darch.stopClassErr = -Inf, darch.stopValidErr = -Inf,
darch.stopValidClassErr = -Inf, darch.numEpochs = 0,
darch.retainData = T, dataSet = NULL, dataSetValid = NULL,
gputools = T)
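A minimal usage sketch on toy XOR data follows; the layer sizes, epoch counts and batch size are illustrative choices, not recommendations, and darch.bootstrap is disabled here only because the toy data set is too small to split into training and validation sets.

library(darch)

# Toy XOR data: two input columns, one binary target column.
trainData <- matrix(c(0, 0,
                      0, 1,
                      1, 0,
                      1, 1), ncol = 2, byrow = TRUE)
trainTargets <- matrix(c(0, 1, 1, 0), ncol = 1)

# Two input neurons, ten hidden neurons, one output neuron (layers = c(2, 10, 1)),
# with 5 epochs of RBM pre-training followed by 20 epochs of fine-tuning.
model <- darch(trainData, trainTargets,
               layers = c(2, 10, 1),
               rbm.numEpochs = 5,
               darch.batchSize = 2,
               darch.bootstrap = FALSE,
               darch.numEpochs = 20)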
x: Input data.
y: Target data.
layers: Vector containing one integer for the number of neurons in each layer. Defaults to c(a, 10, b), where a is the number of columns in the training data and b is the number of columns in the targets.
...: Additional parameters.
xValid: Validation input data.
yValid: Validation target data.
scale: Logical or logical vector indicating whether or which columns to scale.
normalizeWeights: Logical indicating whether to normalize weights (L2 norm = 1).
rbm.batchSize: Pre-training batch size.
rbm.trainOutputLayer: Logical indicating whether to train the output layer RBM as well (only useful for unsupervised fine-tuning).
rbm.learnRateWeights: Learn rate for the weights during pre-training.
rbm.learnRateBiasVisible: Learn rate for the weights of the visible bias.
rbm.learnRateBiasHidden: Learn rate for the weights of the hidden bias.
rbm.weightCost: Pre-training weight cost. Higher values result in lower weights.
rbm.initialMomentum: Initial momentum during pre-training.
rbm.finalMomentum: Final momentum during pre-training.
rbm.momentumSwitch: Epoch during which momentum is switched from the initial to the final value.
rbm.visibleUnitFunction: Visible unit function during pre-training.
rbm.hiddenUnitFunction: Hidden unit function during pre-training.
rbm.updateFunction: Update function during pre-training.
rbm.errorFunction: Error function during pre-training.
rbm.genWeightFunction: Function to generate the initial RBM weights.
rbm.numCD: Number of full steps for which contrastive divergence is performed.
rbm.numEpochs: Number of pre-training epochs.
darch.batchSize: Batch size, i.e. the number of training samples that are presented to the network before weight updates are performed (for both pre-training and fine-tuning).
darch.bootstrap: Logical indicating whether to use bootstrapping to create a training and validation data set from the given data.
darch.genWeightFunc: Function to generate the initial weights of the DBN.
darch.logLevel: Log level. futile.logger::INFO by default.
darch.fineTuneFunction: Fine-tuning function.
darch.initialMomentum: Initial momentum during fine-tuning.
darch.finalMomentum: Final momentum during fine-tuning.
darch.momentumSwitch: Epoch at which to switch from the initial to the final momentum value.
darch.learnRateWeights: Learn rate for the weights during fine-tuning.
darch.learnRateBiases: Learn rate for the biases during fine-tuning.
darch.errorFunction: Error function during fine-tuning.
darch.dropoutInput: Dropout rate on the network input.
darch.dropoutHidden: Dropout rate on the hidden layers.
darch.dropoutOneMaskPerEpoch: Whether to generate a new dropout mask for each batch (FALSE, default) or for each epoch (TRUE); see the configuration sketch after this argument list.
darch.layerFunctionDefault: Default activation function for the DBN layers.
darch.layerFunctions: A list of activation functions; names() should be a character vector of layer numbers. Note that the entry for layer 1 signifies the layer function between layers 1 and 2, i.e. the output of layer 2; layer 1 itself does not have a layer function, since the input values are used directly.
darch.layerFunction.maxout.poolSize: Pool size for maxout units, when using the maxout activation function (see the configuration sketch after this argument list).
darch.isBin: Whether network outputs are to be treated as binary values.
darch.isClass: Whether classification errors should be printed during fine-tuning.
darch.stopErr: When the value of the error function is lower than or equal to this value, training is stopped.
darch.stopClassErr: When the classification error is lower than or equal to this value, training is stopped (0..100).
darch.stopValidErr: When the value of the error function on the validation data is lower than or equal to this value, training is stopped.
darch.stopValidClassErr: When the classification error on the validation data is lower than or equal to this value, training is stopped (0..100).
darch.numEpochs: Number of epochs of fine-tuning.
gputools: Logical indicating whether to use gputools for matrix multiplication, if available.
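As referenced in the dropout and maxout entries above, the following sketch combines the dropout, layer-function and stopping arguments in one call. It reuses the trainData and trainTargets objects from the earlier sketch, assumes that a maxoutUnitDerivative unit function is available in the installed darch version, and uses purely illustrative numeric values.

model <- darch(trainData, trainTargets,
               layers = c(2, 20, 1),
               darch.batchSize = 2,
               darch.bootstrap = FALSE,
               # Drop 10% of input units and 20% of hidden units; one mask per epoch.
               darch.dropoutInput = 0.1,
               darch.dropoutHidden = 0.2,
               darch.dropoutOneMaskPerEpoch = TRUE,
               # Maxout on the output of layer 2 (list entry "1"), pools of 2 units
               # (20 hidden neurons / pool size 2); maxoutUnitDerivative is assumed
               # to exist in the installed darch version.
               darch.layerFunctions = list("1" = maxoutUnitDerivative),
               darch.layerFunction.maxout.poolSize = 2,
               # Stop fine-tuning as soon as the classification error reaches 0%.
               darch.stopClassErr = 0,
               darch.numEpochs = 100)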
Other darch interface functions: darch.DataSet; darch.formula; darch; predict.DArch, predict.darch; print.DArch, print.darch.
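A short follow-up sketch using the predict method listed above; the newdata argument and the rounding step are assumptions about the installed darch version's predict.DArch interface, which is taken here to return raw network outputs by default.

# Raw network outputs for the training inputs, rounded to compare with the targets.
predictions <- predict(model, newdata = trainData)
round(predictions)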