loclda(x, ...)
## S3 method for class 'formula':
loclda(formula, data, ..., subset, na.action)
## S3 method for class 'default':
loclda(x, grouping, weight.func = function(x) 1/exp(x),
k = nrow(x), weighted.apriori = TRUE, ...)
## S3 method for class 'data.frame':
loclda(x, ...)
## S3 method for class 'matrix':
loclda(x, grouping, ..., subset, na.action)
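formula: Formula of the form groups ~ x1 + x2 + ...
data: Data frame from which variables specified in formula are to be taken.
x: Matrix or data frame containing the explanatory variables (required, if formula is not given).
grouping: (Required if no formula principal argument is given.) A factor specifying the class for each observation.
weight.func: Function used to compute local weights (see Details below).
k: Number of nearest neighbours used to construct the localized classification rule (see Details below).
weighted.apriori: Logical: if TRUE, class prior probabilities are computed using local weights (see Details below). If FALSE, equal priors for all classes actually occurring in the train data are used.
subset: An index vector specifying the cases to be used in the training sample.
na.action: A function specifying the action to be taken if NAs are found. The default action is for the procedure to fail. An alternative is na.omit, which leads to rejection of cases with missing values on any required variable.
...: Further arguments to be passed to loclda.default.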
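A list of class loclda containing the following components:
weight.func: Value of the argument weight.func.
k: Value of the argument k.
weighted.apriori: Value of the argument weighted.apriori.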
loclda generates an object of class loclda (see Value above). As localization makes it necessary to build an individual decision rule for each test observation, this rule construction has to be handled by predict.loclda. For convenience, the rule building procedure is still described here.

To classify a test observation $x_s$, only the k
nearest neighbours of
$x_s$ within the train data are used. Each of these k train observations
$x_i, i = 1,\dots,k$, is assigned a weight $w_i$ according to
$$w_i = K\left(\frac{||x_i-x_s||}{d_k}\right), i=1,\dots,k$$
where K is the weighting function given by weight.func, $||x_i-x_s||$
is the Euclidean distance between $x_i$ and $x_s$,
and $d_k$ is the Euclidean distance from $x_s$
to its $k$-th nearest neighbour.
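
The weighting step can be sketched in base R as follows. This is only an illustration of the formula above, not the code used inside klaR: the helper name local_weights and the test observation are made up, and the sketch assumes $d_k > 0$.

```r
# Hypothetical helper (not part of klaR): local weights for one test point.
local_weights <- function(train, x_s, k, weight_func = function(x) 1 / exp(x)) {
  train <- as.matrix(train)
  # Euclidean distance from every training row to the test observation x_s
  d <- sqrt(rowSums(sweep(train, 2, x_s)^2))
  # indices of the k nearest neighbours, closest first
  nn <- order(d)[seq_len(k)]
  # distance to the k-th nearest neighbour (assumed to be > 0)
  d_k <- d[nn[k]]
  # scaled distances lie in [0, 1] before the weight function is applied
  list(index = nn, weight = weight_func(d[nn] / d_k))
}

train <- as.matrix(iris[, 1:4])
x_s <- c(6.0, 3.0, 4.5, 1.5)   # made-up test observation
lw <- local_weights(train, x_s, k = 10)
```

With the default weight function shown in the usage, the closest neighbours receive weights near 1 and the $k$-th neighbour receives exp(-1).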
With these weights, the weighted empirical mean $\hat{\mu}_g$ and the weighted empirical
covariance matrix are computed for each class $A_g, g=1,\dots,G$. The estimated pooled (weighted) covariance matrix
$\hat{\Sigma}$ is then calculated from the individual weighted
empirical class covariance matrices. If weighted.apriori is TRUE (the default),
prior class probabilities are estimated according to:
$$prior_g := \frac{\sum_{i=1}^k \left(w_i \cdot I (x_i \in A_g)\right)}{\sum_{i=1}^k \left( w_i \right)}$$
where I is the indicator function. If weighted.apriori is FALSE, equal priors for all classes are used.
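
A rough base R sketch of these estimates is given here. The data and weights are stand-ins, and the normalisation of the pooled covariance (within-class weighted scatter divided by the total weight, which equals a weight-proportional average of the class covariance matrices) is an assumption; the text does not state the exact form used by klaR.

```r
# Sketch only: the exact normalisation used by klaR may differ.
set.seed(1)
rows <- c(1:15, 51:65, 101:115)
X <- as.matrix(iris[rows, 1:4])       # stand-in for the k neighbours
g <- droplevels(iris$Species[rows])   # their class labels
w <- runif(nrow(X), 0.4, 1)           # stand-in local weights w_i

# weighted priors: share of the total weight falling into each class
prior <- tapply(w, g, sum) / sum(w)

# weighted class means, one row per class
mu <- t(sapply(levels(g), function(cl) {
  colSums(X[g == cl, , drop = FALSE] * w[g == cl]) / sum(w[g == cl])
}))

# pooled covariance: weighted within-class scatter over the total weight
# (assumed normalisation)
centered <- X - mu[as.character(g), ]
Sigma <- crossprod(centered * sqrt(w)) / sum(w)
```

Setting all weights to 1 reduces prior to the class frequencies among the neighbours and mu to the ordinary class means.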
In analogy to Linear Discriminant Analysis, the decision rule for $x_s$ is
$$\hat{A} := \arg\max_{g \in \{1,\dots,G\}} (posterior_g)$$
where
$$posterior_g := prior_g \cdot \exp{\left( -\frac{1}{2} (x_s-\hat{\mu}_g)^{T}\hat{\Sigma}^{-1}(x_s-\hat{\mu}_g)\right)}$$
If $posterior_g < 10^{-150} \; \forall g \in \{1,\dots,G\}$,
$posterior_g$ is set to $\frac{1}{G}$ for all $g \in \{1,\dots,G\}$
and the test observation $x_s$ is simply assigned to the class whose weighted mean has the smallest
Euclidean distance to $x_s$.
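
The decision rule, including the fallback for vanishing posteriors, might look roughly like this in base R. The helper classify_local and all input values are hypothetical and serve only to illustrate the formulas above; predict.loclda may be implemented differently.

```r
# Hypothetical helper (not klaR's code): decision rule for one test point.
classify_local <- function(x_s, mu, Sigma, prior, eps = 1e-150) {
  Sigma_inv <- solve(Sigma)
  # squared Mahalanobis distance of x_s to each weighted class mean
  maha <- apply(mu, 1, function(m) drop(t(x_s - m) %*% Sigma_inv %*% (x_s - m)))
  post <- prior * exp(-0.5 * maha)
  if (all(post < eps)) {
    # all posteriors vanish: set them to 1/G, nearest weighted mean decides
    post[] <- 1 / length(post)
    euclid <- sqrt(rowSums(sweep(mu, 2, x_s)^2))
    return(list(class = names(euclid)[which.min(euclid)], posterior = post))
  }
  list(class = names(post)[which.max(post)], posterior = post)
}

mu <- rbind(a = c(0, 0), b = c(3, 3))   # made-up weighted class means
Sigma <- diag(2)                        # made-up pooled covariance
prior <- c(a = 0.5, b = 0.5)
near <- classify_local(c(0.5, 0.2), mu, Sigma, prior)
far  <- classify_local(c(100, 90), mu, Sigma, prior)   # triggers the fallback
```

The second call shows why the fallback exists: far from every class mean the exponential underflows to zero for all classes, so the posteriors carry no information and the Euclidean distance to the weighted means is used instead.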
See also predict.loclda and lda.
library(klaR)  # provides loclda and benchB3
# compare the l1co.error of lda and loclda
benchB3("lda")$l1co.error
benchB3("loclda")$l1co.error