ann: Fit Artificial Neural Networks.

Description

Fits a single hidden layer ANN model to input data x and output data y.

Usage

ann(x, y, size, act_hid = c("tanh", "sigmoid", "linear", "exp"),
  act_out = c("linear", "sigmoid", "tanh", "exp"), Wts = NULL, rang = 0.5,
  objfn = NULL, method = "BFGS", maxit = 1000, abstol = 1e-04,
  reltol = 1e-08, trace = TRUE, ...)

Arguments

x

matrix, data frame or vector of numeric input values, with ncol(x) equal to the number of inputs/predictors and nrow(x) equal to the number of examples. A vector is treated as a set of examples of a single input or predictor variable.

y

matrix, data frame or vector of target values for the examples.

size

number of hidden layer nodes. Can be zero.

act_hid

activation function to be used at the hidden layer. See 'Details'.

act_out

activation function to be used at the output layer. See 'Details'.

Wts

initial weight vector. If NULL, the weights are chosen at random.

rang

initial random weights are drawn from the interval [-rang, rang]. Default is 0.5.

objfn

objective function to be minimised when fitting weights. This function may be user-defined, with the first two arguments corresponding to y (the observed target data) and y_hat (the ANN output). If the function has additional parameters that require optimizing, these must be defined in argument par_of (see the AR(1) case in 'Examples'). When objfn = NULL, the default is sse, an internal function that computes the sum of squared errors, with error given by y - y_hat.

method

the method to be used by optim for minimising the objective function. May be "Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN" or "Brent". Can be abbreviated. Default is "BFGS".

maxit

maximum number of iterations used by optim. Default value is 1000.

abstol

absolute convergence tolerance (stopping criterion) used by optim. Default is 1e-4.

reltol

relative convergence tolerance (stopping criterion) used by optim. Optimization stops if the value returned by objfn cannot be reduced by a factor of reltol * (abs(val) + reltol) at a step. Default is 1e-8.

trace

logical. Should optimization be traced? Default is TRUE.

...

arguments to be passed to user-defined objfn. Initial values of any parameters (in addition to the ANN weights) requiring optimization must be supplied in argument par_of (see the AR(1) case in 'Examples').
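As an illustration of the objfn and ... arguments, the following is a minimal sketch of a user-defined objective function with one extra, fixed argument. The function name huber_loss and its argument delta are hypothetical and are not part of the package; only the (y, y_hat) ordering of the first two arguments comes from the description above. The commented-out ann() call shows how such an argument might be forwarded, assuming ... is passed through to objfn unchanged.

```r
# Hypothetical user-defined objective: a Huber-type loss.
# The first two arguments follow the documented convention (y, then y_hat);
# `delta` is an illustrative extra argument, not something the package defines.
huber_loss <- function(y, y_hat, delta = 1) {
  err <- y - y_hat
  small <- abs(err) <= delta
  # Quadratic penalty for small errors, linear penalty for large ones.
  sum(ifelse(small, 0.5 * err^2, delta * (abs(err) - 0.5 * delta)))
}

# Evaluate the objective on toy data.
y_obs <- c(0, 0)
y_fit <- c(0.5, 3)
loss <- huber_loss(y_obs, y_fit, delta = 1)

# Assumed usage with ann(), with delta forwarded through `...`:
# fit <- ann(x, y, size = 3, objfn = huber_loss, delta = 1)
```

Because delta is held fixed here, it would not need to appear in par_of; par_of is described above as being for parameters that are optimized alongside the weights.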

Value

object of class "ann" with components describing the ANN structure and the following output components:

wts

best set of weights found.

par_of

best values of additional objfn parameters. This component will only be returned if a user-defined objfn is supplied and argument par_of is included in the function call (see the AR(1) case in 'Examples').

value

value of objective function.

fitted.values

fitted values for the training data.

residuals

residuals for the training data.

convergence

integer code returned by optim. 0 indicates successful completion; see optim for possible error codes.

derivs

matrix of derivatives of hidden (columns 1:size) and output (final column) nodes.
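The convergence code can be explored with optim directly. The sketch below is not the package's implementation: toy_obj is a hypothetical stand-in for the objective evaluated at a weight vector, the uniform draw for the starting values is an assumption about how rang is used, and the control list shows how the method, maxit, abstol and reltol arguments would plausibly map onto optim.

```r
# Hypothetical stand-in for an objective evaluated at a weight vector w.
toy_obj <- function(w) (w[1] - 1)^2 + 10 * (w[2] + 2)^2

# Assumption: initial weights drawn uniformly on [-rang, rang].
rang <- 0.5
set.seed(1)
w0 <- runif(2, -rang, rang)

# Controls mirroring the defaults listed under 'Arguments'.
fit_full <- optim(w0, toy_obj, method = "BFGS",
                  control = list(maxit = 1000, abstol = 1e-4,
                                 reltol = 1e-8, trace = 0))

# The same problem with the iteration budget cut to a single iteration.
fit_short <- optim(w0, toy_obj, method = "BFGS",
                   control = list(maxit = 1))

fit_full$convergence   # code after a full run
fit_short$convergence  # code after hitting the iteration limit
```

In optim, a code of 1 means the iteration limit maxit was reached, so a non-zero convergence component is a cue to rerun with a larger maxit or different starting weights.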

Details

The "linear" activation, or transfer, function is the identity function, where the output of a node is equal to its input: \(f(x) = x\). The "sigmoid" function is the standard logistic sigmoid function, given by \(f(x) = \frac{1}{1+e^{-x}}\). The "tanh" function is the hyperbolic tangent function, given by \(f(x) = \frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}\). The "exp" function is the exponential function, given by \(f(x) = e^{x}\). The default configuration of activation functions is act_hid = "tanh" and act_out = "linear".

Optimization (minimization) of the objective function (objfn) is performed by optim using the method specified.

The derivatives returned are first-order partial derivatives of the hidden and output nodes with respect to their inputs. These may be useful for sensitivity analyses.
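The four activation functions and their first derivatives can be written out in a few lines of base R. This is a sketch for reference only: the function names (act_*, d_*) are hypothetical, and whether the derivs component is computed in exactly this way is an assumption.

```r
# Activation functions as defined above.
act_linear  <- function(x) x
act_sigmoid <- function(x) 1 / (1 + exp(-x))
act_tanh    <- function(x) (exp(x) - exp(-x)) / (exp(x) + exp(-x))
act_exp     <- function(x) exp(x)

# First derivatives of each activation with respect to the node input.
d_linear  <- function(x) rep(1, length(x))
d_sigmoid <- function(x) {
  s <- act_sigmoid(x)
  s * (1 - s)
}
d_tanh    <- function(x) 1 - act_tanh(x)^2
d_exp     <- function(x) exp(x)

# Tabulate activations and derivatives at a few node inputs.
z <- c(-2, 0, 2)
acts <- cbind(linear = act_linear(z), sigmoid = act_sigmoid(z),
              tanh = act_tanh(z), exp = act_exp(z))
ders <- cbind(linear = d_linear(z), sigmoid = d_sigmoid(z),
              tanh = d_tanh(z), exp = d_exp(z))
```

The sigmoid and tanh derivatives shrink towards zero for large positive or negative inputs, which is one reason derivative values can be informative in a sensitivity analysis.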

Examples

## fit 1-hidden node ann model with tanh activation at the hidden layer and
## linear activation at the output layer.
## Use 200 random samples from ar9 dataset.
## ---
library(validann)  # package providing ann() and the ar9 dataset
data("ar9")
samp <- sample(1:1000, 200)
y <- ar9[samp, ncol(ar9)]
x <- ar9[samp, -ncol(ar9)]
x <- x[, c(1,4,9)]
fit <- ann(x, y, size = 1, act_hid = "tanh", act_out = "linear", rang = 0.1)

## fit 3-hidden node ann model to ar9 data with user-defined AR(1) objective
## function
## ---
ar1_sse <- function(y, y_hat, par_of) {
  err <- y - y_hat
  err[-1] <- err[-1] - par_of * err[-length(y)]
  sum(err ^ 2)
}
fit <- ann(x, y, size = 3, act_hid = "tanh", act_out = "linear", rang = 0.1,
           objfn = ar1_sse, par_of = 0.7)
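As a quick sanity check on the AR(1) objective used above, the sketch below evaluates ar1_sse on a small made-up data set. The vectors y_obs and y_fit are hypothetical toy values. With par_of = 0 the objective reduces to the ordinary sum of squared errors, and a non-zero par_of subtracts the scaled previous error from each error before squaring.

```r
# Same AR(1) objective as in the example above, repeated so that this
# block runs on its own.
ar1_sse <- function(y, y_hat, par_of) {
  err <- y - y_hat
  err[-1] <- err[-1] - par_of * err[-length(y)]
  sum(err ^ 2)
}

# Hypothetical toy data: observed targets and model outputs.
y_obs <- c(1.0, 2.0, 1.5, 3.0)
y_fit <- c(0.5, 1.0, 1.0, 2.0)

plain_sse <- sum((y_obs - y_fit)^2)       # ordinary sum of squared errors
sse_phi0  <- ar1_sse(y_obs, y_fit, 0)     # no autoregressive correction
sse_phi05 <- ar1_sse(y_obs, y_fit, 0.5)   # lag-1 coefficient of 0.5
```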