pcaNNet.default: Neural Networks with a Principal Component Step

Description

Run principal component analysis (PCA) on a dataset, then use the retained components as predictors in a neural network model.

Usage

## S3 method for class 'default':
pcaNNet(x, y, thresh = 0.99, ...)
## S3 method for class 'formula':
pcaNNet(formula, data, weights, ..., 
        thresh = .99, subset, na.action, contrasts = NULL)
## S3 method for class 'pcaNNet':
predict(object, newdata, type = c("raw", "class"), ...)

Arguments

formula

A formula of the form class ~ x1 + x2 + ...

x

matrix or data frame of x values for examples.

y

matrix or data frame of target values for examples.

weights

(case) weights for each example -- if missing defaults to 1.

thresh

a threshold for the cumulative proportion of variance to capture from the PCA. For example, to retain enough components to capture 95 percent of the variation, set thresh = .95.

data

Data frame from which variables specified in formula are preferentially to be taken.

subset

An index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.)

na.action

A function to specify the action to be taken if NAs are found. The default action is for the procedure to fail. An alternative is na.omit, which leads to rejection of cases with missing values on any required variable. (NOTE: If given, this argument must be named.)

contrasts

a list of contrasts to be used for some or all of the factors appearing as variables in the model formula.

object

an object of class pcaNNet as returned by pcaNNet.

newdata

matrix or data frame of test examples. A vector is considered to be a row vector comprising a single case.

type

Type of output, either "raw" or "class".

...

arguments passed to nnet

Value

For pcaNNet, an object of class "pcaNNet" or "pcaNNet.formula". Items of interest in the output are:

pc

the output from preProcess

model

the model generated from nnet

names

if any predictors had only one distinct value, this is a character string of the remaining columns. Otherwise a value of NULL

Details

The function first runs principal component analysis on the data. The cumulative percentage of variance is computed for each principal component, and the thresh argument determines how many components must be retained to capture this amount of variance in the predictors.
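The component selection described above can be sketched in base R. This is only an illustration and not the code pcaNNet runs internally: it assumes prcomp with centering and scaling as a stand-in for the PCA step, uses simulated data, and the helper name n_components is hypothetical.

```r
# Hypothetical helper: smallest number of components whose cumulative
# proportion of variance reaches `thresh`.
n_components <- function(x, thresh = 0.99) {
  pca <- prcomp(x, center = TRUE, scale. = TRUE)
  cum_var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
  which(cum_var >= thresh)[1]
}

# Simulated predictors (an assumption for illustration only)
set.seed(1)
x <- matrix(rnorm(200 * 5), ncol = 5)
x <- cbind(x, x[, 1] + x[, 2])  # sixth column is a linear combination

k99 <- n_components(x, thresh = 0.99)
k50 <- n_components(x, thresh = 0.50)
c(k99 = k99, k50 = k50)
```

A lower thresh can never require more components than a higher one, so thresh acts as a dial between dimension reduction and retained variance.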

The principal components are then used in a neural network model.
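As a rough sketch of this step, the component scores can be passed to nnet directly. The simulated data, the choice of three components, and the network size are assumptions made for illustration; pcaNNet itself handles the PCA step for you.

```r
library(nnet)  # recommended package distributed with R

# Simulated regression data (an assumption for illustration only)
set.seed(2)
x <- matrix(rnorm(100 * 4), ncol = 4)
y <- x[, 1] - 0.5 * x[, 2] + rnorm(100, sd = 0.1)

# PCA on centered and scaled predictors; keep three components
pca <- prcomp(x, center = TRUE, scale. = TRUE)
scores <- pca$x[, 1:3, drop = FALSE]

# Regression network on the component scores (linear output unit)
fit <- nnet(scores, y, size = 3, linout = TRUE, trace = FALSE)
pred <- predict(fit, scores)
head(pred)
```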

When predicting samples, the new data are transformed using the PCA fitted on the training data and then predicted. Because the variance of each predictor is used in the PCA, the code does a quick check to make sure that each predictor has at least two distinct values. If a predictor has only one unique value, it is removed prior to the analysis.
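The two ideas in this paragraph, dropping single-valued predictors and reusing the training PCA on new data, can be illustrated with base R. This is a sketch under assumptions (simulated data frames, prcomp in place of the package's preprocessing) and not the internal implementation.

```r
# Simulated training and test sets; `const` has a single distinct value
set.seed(3)
train <- data.frame(a = rnorm(50), b = rnorm(50), const = 1)
test  <- data.frame(a = rnorm(10), b = rnorm(10), const = 1)

# Keep only predictors with at least two distinct values in the training set
has_variation <- vapply(train, function(col) length(unique(col)) > 1, logical(1))
keep <- names(train)[has_variation]

# Fit the PCA on the training data only
pca <- prcomp(train[, keep, drop = FALSE], center = TRUE, scale. = TRUE)

# Project new data using the training centers, scales, and loadings
test_scores <- predict(pca, newdata = test[, keep, drop = FALSE])
dim(test_scores)
```

Fitting the PCA on the training set alone keeps information from the test samples out of the transformation.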

References

Ripley, B. D. (1996) Pattern Recognition and Neural Networks. Cambridge.

Examples

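library(caret)

# bbbDescr holds the predictors, logBBB the numeric outcome
data(BloodBrain)
modelFit <- pcaNNet(bbbDescr, logBBB, size = 5, linout = TRUE, trace = FALSE)
modelFit

# Predict on the training descriptors
predict(modelFit, bbbDescr)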
