fda
From mda v0.4-8
by Trevor Hastie
Flexible Discriminant Analysis
Flexible discriminant analysis.
- Keywords
- classif
Usage
fda(formula, data, weights, theta, dimension, eps, method, keep.fitted, ...)
Arguments
- formula
- of the form y~x, it describes the response and the predictors. The formula can be more complicated, such as y~log(x)+z etc (see formula for more details). The response should be a factor representing the response variable, or any vector that can be coerced to such (such as a logical variable).
- data
- data frame containing the variables in the formula (optional).
- weights
- an optional vector of observation weights.
- theta
- an optional matrix of class scores, typically with fewer than J-1 columns.
- dimension
- the dimension of the solution, no greater than J-1, where J is the number of classes. Default is J-1.
- eps
- a threshold for small singular values for excluding discriminant variables; default is .Machine$double.eps.
- method
- regression method used in optimal scaling. Default is linear regression via the function polyreg, resulting in linear discriminant analysis. Other possibilities are mars and bruto. For Penalized Discriminant Analysis gen.ridge is appropriate.
- keep.fitted
- a logical variable, which determines whether the (sometimes large) component "fitted.values" of the fit component of the returned fda object should be kept. The default is TRUE if n * dimension < 5000.
- ...
- additional arguments to method.
Value
an object of class "fda". Use predict to extract discriminant variables, posterior probabilities or predicted class memberships. Other extractor functions are coef, confusion and plot.

The object has the following components:
- percent.explained
- the percent between-group variance explained by each dimension (relative to the total explained).
- values
- optimal scaling regression sum-of-squares for each dimension (see reference). The usual discriminant analysis eigenvalues are given by values / (1-values), which are used to define percent.explained.
- means
- class means in the discriminant space. These are also scaled versions of the final theta's or class scores, and can be used in a subsequent call to fda (this only makes sense if some columns of theta are omitted; see the references).
- theta.mod
- (internal) a class scoring matrix which allows predict to work properly.
- dimension
- dimension of discriminant space.
- prior
- class proportions for the training data.
- fit
- fit object returned by method.
- call
- the call that created this object (allowing it to be update-able).
- confusion
- confusion matrix when classifying the training data.

The method functions are required to take arguments x and y where both can be matrices, and should produce a matrix of fitted.values the same size as y. They can take additional arguments weights and should all have a ... for safety's sake. Any arguments to method can be passed on via the ... argument of fda. The default method polyreg has a degree argument which allows polynomial regression of the required total degree. See the documentation for predict.fda for further requirements of method. The package earth is suggested for this package as well; earth is a more detailed implementation of the mars model, and works as a method argument.
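The interface required of a method function can be illustrated with a minimal sketch in base R. The function name ridge_method, its lambda argument and the returned list layout are hypothetical and are not part of mda; returning a list with a fitted.values component is an assumption based on the keep.fitted description above. The sketch covers only the x, y, weights and ... requirements stated here, not the further requirements documented for predict.fda.

```r
# Hypothetical method function (illustration only, not part of mda):
# weighted least squares with a ridge penalty on the slopes.
# x: n x p matrix of predictors; y: n x k matrix of scored responses.
ridge_method <- function(x, y, weights = rep(1, nrow(x)), lambda = 1, ...) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  xi <- cbind(Intercept = 1, x)                  # add an intercept column
  penalty <- diag(c(0, rep(lambda, ncol(x))))    # leave the intercept unpenalized
  xtw <- t(xi * weights)                         # X'W (rows of xi scaled by weights)
  coef <- solve(xtw %*% xi + penalty, xtw %*% y) # penalized normal equations
  # fitted.values has the same dimensions as y, as required above
  list(fitted.values = xi %*% coef, coefficients = coef)
}

# Stand-alone check on simulated data (no call to fda is made here)
set.seed(1)
x <- matrix(rnorm(40), nrow = 20)
y <- matrix(rnorm(40), nrow = 20)
fit <- ridge_method(x, y, lambda = 0.5)
dim(fit$fitted.values)
```

Whether such a function can be passed directly as the method argument also depends on the requirements listed in the predict.fda documentation, so treat this as a starting point rather than a drop-in replacement for polyreg or gen.ridge.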
References
``Flexible Discriminant Analysis by Optimal Scoring'' by Hastie, Tibshirani and Buja, 1994, JASA, 1255-1270.
``Penalized Discriminant Analysis'' by Hastie, Buja and Tibshirani, 1995, Annals of Statistics, 73-102.
``Elements of Statistical Learning - Data Mining, Inference and Prediction'' (2nd edition, Chapter 12) by Hastie, Tibshirani and Friedman, 2009, Springer.
See Also
predict.fda, plot.fda, mars, bruto, polyreg, softmax, confusion
Examples
data(iris)
irisfit <- fda(Species ~ ., data = iris)
irisfit
## fda(formula = Species ~ ., data = iris)
##
## Dimension: 2
##
## Percent Between-Group Variance Explained:
##     v1     v2
##  99.12 100.00
##
## Degrees of Freedom (per dimension): 5
##
## Training Misclassification Error: 0.02 ( N = 150 )
confusion(irisfit, iris)
##            Setosa Versicolor Virginica
## Setosa         50          0         0
## Versicolor      0         48         1
## Virginica       0          2        49
## attr(, "error"):
## [1] 0.02
plot(irisfit)
coef(irisfit)
## [,1] [,2]
## [1,] -2.126479 -6.72910343
## [2,] -0.837798 0.02434685
## [3,] -1.550052 2.18649663
## [4,] 2.223560 -0.94138258
## [5,] 2.838994 2.86801283
marsfit <- fda(Species ~ ., data = iris, method = mars)
marsfit2 <- update(marsfit, degree = 2)
marsfit3 <- update(marsfit, theta = marsfit$means[, 1:2])
## this refits the model, using the fitted means (scaled theta's)
## from marsfit to start the iterations
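The relationship between the values and percent.explained components described under Value can be sketched in base R. The numbers below are hypothetical, not output from a real fit, and the cumulative form of the percentages is an assumption inferred from the printed output above, where the last entry is 100.

```r
# Hypothetical optimal scaling sums-of-squares, one per dimension
# (illustrative numbers only, not taken from a fitted fda object).
values <- c(0.97, 0.22)

# Usual discriminant analysis eigenvalues: values / (1 - values)
eigenvalues <- values / (1 - values)

# Cumulative percent of between-group variance explained
# (assumed cumulative, matching the printed fda output).
percent_explained <- 100 * cumsum(eigenvalues) / sum(eigenvalues)
print(round(percent_explained, 2))
```

With a fitted object, the same calculation could be applied to the values component (for example irisfit$values) and compared against percent.explained.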