ppr
Projection Pursuit Regression
Fit a projection pursuit regression model.
 Keywords
 regression
Usage
ppr(x, ...)
## S3 method for class 'formula'
ppr(formula, data, weights, subset, na.action, contrasts = NULL, ..., model = FALSE)
## Default S3 method:
ppr(x, y, weights = rep(1, n), ww = rep(1, q), nterms, max.terms = nterms, optlevel = 2, sm.method = c("supsmu", "spline", "gcvspline"), bass = 0, span = 0, df = 5, gcvpen = 1, ...)
Arguments
 formula
 a formula specifying one or more numeric response variables and the explanatory variables.
 x
 numeric matrix of explanatory variables. Rows represent observations, and columns represent variables. Missing values are not accepted.
 y
 numeric matrix of response variables. Rows represent observations, and columns represent variables. Missing values are not accepted.
 nterms
 number of terms to include in the final model.
 data
 a data frame (or similar: see model.frame) from which variables specified in formula are preferentially to be taken.
 weights
 a vector of weights w_i for each case.
 ww
 a vector of weights for each response, so the fit criterion is the sum over cases i and responses j of w_i ww_j (y_ij - fit_ij)^2 divided by the sum of w_i.
 subset
 an index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.)
 na.action
 a function to specify the action to be taken if NAs are found. The default action is given by getOption("na.action"). (NOTE: If given, this argument must be named.)
 contrasts
 the contrasts to be used when any factor explanatory variables are coded.
 max.terms
 maximum number of terms to choose from when building the model.
 optlevel
 integer from 0 to 3 which determines the thoroughness of an optimization routine in the SMART program. See the ‘Details’ section.
 sm.method
 the method used for smoothing the ridge functions. The default is to use Friedman's super smoother supsmu. The alternatives are to use the smoothing spline code underlying smooth.spline, either with a specified (equivalent) degrees of freedom for each ridge function, or to allow the smoothness to be chosen by GCV. Can be abbreviated.
 bass
 super smoother bass tone control used with automatic span selection (see supsmu); the range of values is 0 to 10, with larger values resulting in increased smoothing.
 span
 super smoother span control (see supsmu). The default, 0, results in automatic span selection by local cross validation. span can also take a value in (0, 1].
 df
 if sm.method is "spline", specifies the smoothness of each ridge term via the requested equivalent degrees of freedom.
 gcvpen
 if sm.method is "gcvspline", this is the penalty used in the GCV selection for each degree of freedom used.
 ...
 arguments to be passed to or from other methods.
 model
 logical. If true, the model frame is returned.
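
The weighted fit criterion described under ww can be evaluated by hand once a model is fitted. The following is a minimal sketch, assuming the rock data set used in the Examples, unit case weights, and a single response; the names rock1, x, y, w and crit are illustrative only. It does not claim that crit matches any particular component of the returned object.

```r
# Prepare predictors on comparable scales (same rescaling as in the Examples).
rock1 <- transform(rock, area1 = area / 10000, peri1 = peri / 10000)
x <- as.matrix(rock1[, c("area1", "peri1", "shape")])
y <- log(rock1$perm)

w  <- rep(1, nrow(x))  # case weights w_i (the default)
ww <- 1                # response weights ww_j; there is a single response here

# Default method: explanatory matrix x and response y are passed directly.
fit <- ppr(x, y, weights = w, ww = ww, nterms = 2, max.terms = 3)

# Sum over cases (and the single response) of w_i * ww_j * (y_ij - fit_ij)^2,
# divided by the sum of the case weights.
crit <- sum(w * ww * (y - fitted(fit))^2) / sum(w)
crit
```

With several responses, y would be a matrix with one column per response and ww a vector with one weight per column.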
Details
The basic method is given by Friedman (1984), and is essentially the same code used by S-PLUS's ppreg. This code is extremely sensitive to the compiler used.
The algorithm first adds up to max.terms ridge terms one at a time; it will use fewer if it is unable to find a term to add that makes sufficient difference. It then removes the least important term at each step until nterms terms are left.
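
The effect of this forward-then-backward search can be inspected through the gofn component, which records the criterion for each model size visited on the way back down. A minimal sketch, assuming the rock data set from the Examples (rock1, fit and fit3 are illustrative names):

```r
rock1 <- transform(rock, area1 = area / 10000, peri1 = peri / 10000)

# Build up to 5 ridge terms, then prune back to 2.
fit <- ppr(log(perm) ~ area1 + peri1 + shape, data = rock1,
           nterms = 2, max.terms = 5)

fit$mu    # the nterms argument
fit$ml    # the max.terms argument
fit$gofn  # criterion by number of terms; entries below nterms are not valid

# If the criterion suggests a different size, refit with that nterms.
fit3 <- update(fit, nterms = 3)
fit3$gof
```

Comparing the valid entries of gofn is one informal way to judge whether extra ridge terms are worth keeping.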
The levels of optimization (argument optlevel) differ in how thoroughly the models are refitted during this process. At level 0 the existing ridge terms are not refitted. At level 1 the projection directions are not refitted, but the ridge functions and the regression coefficients are. Levels 2 and 3 refit all the terms and are equivalent for one response; level 3 is more careful to rebalance the contributions from each regressor at each step and so is a little less likely to converge to a saddle point of the sum of squares criterion.
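
To see how much the refitting level matters on a given data set, the same model can be fitted at each optlevel and the resulting criteria compared. This is only a sketch, assuming the rock data set from the Examples; the names rock1, fits and gofs are illustrative, and the values obtained will depend on the platform.

```r
rock1 <- transform(rock, area1 = area / 10000, peri1 = peri / 10000)

# Fit the same two-term model at each optimization level 0, 1, 2, 3.
fits <- lapply(0:3, function(lev)
  ppr(log(perm) ~ area1 + peri1 + shape, data = rock1,
      nterms = 2, max.terms = 3, optlevel = lev))

# Collect the goodness-of-fit criterion of the selected model per level.
gofs <- sapply(fits, function(f) f$gof)
names(gofs) <- paste("optlevel", 0:3)
gofs
```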
Value

A list with the following components, many of which are for use by the
method functions.
 call
 the matched call
 p
 the number of explanatory variables (after any coding)
 q
 the number of response variables
 mu
 the argument nterms
 ml
 the argument max.terms
 gof
 the overall residual (weighted) sum of squares for the selected model
 gofn
 the overall residual (weighted) sum of squares against the number of terms, up to max.terms. Will be invalid (and zero) for fewer than nterms terms.
 df
 the argument df
 edf
 if sm.method is "spline" or "gcvspline", the equivalent number of degrees of freedom for each ridge term used.
 xnames
 the names of the explanatory variables
 ynames
 the names of the response variables
 alpha
 a matrix of the projection directions, with a column for each ridge term
 beta
 a matrix of the coefficients applied for each response to the ridge terms: the rows are the responses and the columns the ridge terms
 yb
 the weighted means of each response
 ys
 the overall scale factor used: internally the responses are divided by ys to have unit total weighted sum of squares.
 fitted.values
 the fitted values, as a matrix if q > 1.
 residuals
 the residuals, as a matrix if q > 1.
 smod
 internal work array, which includes the ridge functions evaluated at the training set points.
 model
 (only if model = TRUE) the model frame.
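
Several of these components can be used directly. As a minimal sketch, assuming the rock data set from the Examples (rock1, fit, x and proj are illustrative names), the projection directions in alpha can be applied to the explanatory variables to obtain the one-dimensional projections that feed each ridge function:

```r
rock1 <- transform(rock, area1 = area / 10000, peri1 = peri / 10000)
fit <- ppr(log(perm) ~ area1 + peri1 + shape, data = rock1,
           nterms = 2, max.terms = 3)

fit$p           # number of explanatory variables
fit$q           # number of responses
dim(fit$alpha)  # one row per explanatory variable, one column per ridge term

# Project the explanatory variables onto each direction.
x <- as.matrix(rock1[, fit$xnames])
proj <- x %*% fit$alpha
head(proj)
```

Plotting a column of proj against the response gives a rough view of the shape that the corresponding ridge function has to capture.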
Source
Friedman (1984): converted to double precision and added interface to smoothing splines by B. D. Ripley, originally for the MASS package.
References
Friedman, J. H. and Stuetzle, W. (1981) Projection pursuit regression. Journal of the American Statistical Association, 76, 817–823.
Friedman, J. H. (1984) SMART User's Guide. Laboratory for Computational Statistics, Stanford University Technical Report No. 1.
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Springer.
See Also
Examples
library(stats)
require(graphics)
# Note: your numerical values may differ
attach(rock)
area1 <- area/10000; peri1 <- peri/10000
rock.ppr <- ppr(log(perm) ~ area1 + peri1 + shape,
                data = rock, nterms = 2, max.terms = 5)
rock.ppr
# Call:
# ppr.formula(formula = log(perm) ~ area1 + peri1 + shape, data = rock,
# nterms = 2, max.terms = 5)
#
# Goodness of fit:
#  2 terms  3 terms  4 terms  5 terms
# 8.737806 5.289517 4.745799 4.490378
summary(rock.ppr)
# ..... (same as above)
# .....
#
# Projection direction vectors:
#           term 1     term 2
# area1 0.34357179 0.37071027
# peri1 0.93781471 0.61923542
# shape 0.04961846 0.69218595
#
# Coefficients of ridge terms:
#    term 1    term 2
# 1.6079271 0.5460971
par(mfrow = c(3,2)) # maybe: , pty = "s")
plot(rock.ppr, main = "ppr(log(perm)~ ., nterms=2, max.terms=5)")
plot(update(rock.ppr, bass = 5), main = "update(..., bass = 5)")
plot(update(rock.ppr, sm.method = "gcv", gcvpen = 2),
main = "update(..., sm.method=\"gcv\", gcvpen=2)")
cbind(perm = rock$perm, prediction = round(exp(predict(rock.ppr)), 1))
detach()