nnet
Fit Neural Networks
Fit single-hidden-layer neural network, possibly with skip-layer connections.
- Keywords
- neural
Usage
nnet(x, ...)## S3 method for class 'formula':
nnet(formula, data, weights, \dots,
subset, na.action, contrasts = NULL)
## S3 method for class 'default':
nnet(x, y, weights, size, Wts, mask,
linout = FALSE, entropy = FALSE, softmax = FALSE,
censored = FALSE, skip = FALSE, rang = 0.7, decay = 0,
maxit = 100, Hess = FALSE, trace = TRUE, MaxNWts = 1000,
abstol = 1.0e-4, reltol = 1.0e-8, \dots)
Arguments
- formula
- A formula of the form
class ~ x1 + x2 + ...
- x
- matrix or data frame of
x
values for examples. - y
- matrix or data frame of target values for examples.
- weights
- (case) weights for each example -- if missing defaults to 1.
- size
- number of units in the hidden layer. Can be zero if there are skip-layer units.
- data
- Data frame from which variables specified in
formula
are preferentially to be taken. - subset
- An index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.)
- na.action
- A function to specify the action to be taken if
NA
s are found. The default action is for the procedure to fail. An alternative is na.omit, which leads to rejection of cases with missing values on any required variable. (NOTE: If given, this - contrasts
- a list of contrasts to be used for some or all of the factors appearing as variables in the model formula.
- Wts
- initial parameter vector. If missing chosen at random.
- mask
- logical vector indicating which parameters should be optimized (default all).
- linout
- switch for linear output units. Default logistic output units.
- entropy
- switch for entropy (= maximum conditional likelihood) fitting. Default by least-squares.
- softmax
- switch for softmax (log-linear model) and maximum conditional
likelihood fitting.
linout
,entropy
,softmax
andcensored
are mutually exclusive. - censored
- A variant on
softmax
, in which non-zero targets mean possible classes. Thus forsoftmax
a row of(0, 1, 1)
means one example each of classes 2 and 3, but forcensored
it means one example whose class is - skip
- switch to add skip-layer connections from input to output.
- rang
- Initial random weights on [-
rang
,rang
]. Value about 0.5 unless the inputs are large, in which case it should be chosen so thatrang
* max(|x|
) is about 1. - decay
- parameter for weight decay. Default 0.
- maxit
- maximum number of iterations. Default 100.
- Hess
- If true, the Hessian of the measure of fit at the best set of weights
found is returned as component
Hessian
. - trace
- switch for tracing optimization. Default
TRUE
. - MaxNWts
- The maximum allowable number of weights. There is no intrinsic limit
in the code, but increasing
MaxNWts
will probably allow fits that are very slow and time-consuming. - abstol
- Stop if the fit criterion falls below
abstol
, indicating an essentially perfect fit. - reltol
- Stop if the optimizer is unable to reduce the fit criterion by a
factor of at least
1 - reltol
. - ...
- arguments passed to or from other methods.
Details
If the response in formula
is a factor, an appropriate classification
network is constructed; this has one output and entropy fit if the
number of levels is two, and a number of outputs equal to the number
of classes and a softmax output stage for more levels. If the
response is not a factor, it is passed on unchanged to nnet.default
.
Optimization is done via the BFGS method of optim
.
Value
- object of class
"nnet"
or"nnet.formula"
. Mostly internal structure, but has components wts the best set of weights found value value of fitting criterion plus weight decay term. fitted.values the fitted values for the training data. residuals the residuals for the training data. convergence 1
if the maximum number of iterations was reached, otherwise0
.
References
Ripley, B. D. (1996) Pattern Recognition and Neural Networks. Cambridge.
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
See Also
Examples
# use half the iris data
ir <- rbind(iris3[,,1],iris3[,,2],iris3[,,3])
targets <- class.ind( c(rep("s", 50), rep("c", 50), rep("v", 50)) )
samp <- c(sample(1:50,25), sample(51:100,25), sample(101:150,25))
ir1 <- nnet(ir[samp,], targets[samp,], size = 2, rang = 0.1,
decay = 5e-4, maxit = 200)
test.cl <- function(true, pred) {
true <- max.col(true)
cres <- max.col(pred)
table(true, cres)
}
test.cl(targets[-samp,], predict(ir1, ir[-samp,]))
# or
ird <- data.frame(rbind(iris3[,,1], iris3[,,2], iris3[,,3]),
species = factor(c(rep("s",50), rep("c", 50), rep("v", 50))))
ir.nn2 <- nnet(species ~ ., data = ird, subset = samp, size = 2, rang = 0.1,
decay = 5e-4, maxit = 200)
table(ird$species[-samp], predict(ir.nn2, ird[-samp,], type = "class"))