# pvs

##### Pairwise variable selection for classification

Pairwise variable selection for numerical data, allowing the use of different classifiers and different variable selection methods.

- Keywords
- multivariate, classif

##### Usage

```
pvs(x, ...)

## S3 method for class 'default':
pvs(x, grouping, prior=NULL, method="lda",
    vs.method=c("ks.test","stepclass","greedy.wilks"), niveau=0.05,
    fold=10, impr=0.1, direct="backward", out=FALSE, ...)

## S3 method for class 'formula':
pvs(formula, data = NULL, ...)
```

##### Arguments

- x
- matrix or data frame containing the explanatory variables (required, if `formula` is not given). `x` must consist of numerical data only.
- formula
- A formula of the form `groups ~ x1 + x2 + ...`. That is, the response is the grouping factor (the classes) and the right hand side specifies the (numerical) discriminators. Interaction terms are not supported.
- data
- data matrix (rows=cases, columns=variables)
- grouping
- class indicator vector (a factor)
- prior
- prior probabilities for the classes. If not specified, the prior probabilities are set according to the class proportions in `grouping`. If specified, the order of the prior probabilities must be the same as in `grouping`.
- method
- character, name of classification function (e.g. `lda` (default)).
- vs.method
- character, name of variable selection method. Must be one of `ks.test` (default), `stepclass` or `greedy.wilks`.
- niveau
- significance level (niveau) used for `ks.test`
- fold
- parameter for cross-validation, if `stepclass` is chosen as `vs.method`
- impr
- least improvement of performance measure desired to include or exclude any variable (<=1), if `stepclass` is chosen as `vs.method`
- direct
- direction of variable selection, if `stepclass` is chosen as `vs.method`. Must be one of `forward` or `backward`.
- out
- indicator (logical) for text output during computation (slows down computation!), if `stepclass` is chosen as `vs.method`
- ...
- further parameters passed to the classification function (`method`) or the variable selection method (`vs.method`)
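
The default for `prior` described in the argument list can be illustrated in base R. This is only a sketch: the toy factor is made up, and it assumes that "the same order as in `grouping`" means the order of `levels(grouping)`, which the text does not state explicitly.

```r
## Toy grouping factor (hypothetical data, for illustration only)
grouping <- factor(c("a", "b", "b", "c", "c", "c"))

## Class proportions, which the documentation names as the default prior
default_prior <- as.vector(table(grouping)) / length(grouping)
names(default_prior) <- levels(grouping)
default_prior

## A user-specified prior, written in the order of levels(grouping)
## (assumption: this is the order pvs expects)
my_prior <- c(a = 0.5, b = 0.25, c = 0.25)
stopifnot(identical(names(my_prior), levels(grouping)),
          isTRUE(all.equal(sum(my_prior), 1)))
```

Checking the names and the sum before passing a prior is a cheap way to catch a misordered or unnormalized vector.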

##### Details

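The classification `method` (e.g. `lda`) must have its own `predict` method (like `predict.lda` for `lda`) and has to return a list with an element `posterior` containing the posterior probabilities. It must be able to deal with matrices as in `method(x, grouping, ...)`.
Examples of such classification methods are `lda`, `qda`, `rda`, `NaiveBayes` or `sknn`. For the classification methods `svm` and `randomForest`, `pvs` contains special routines that make their `predict` methods suitable. However, these two classifiers cannot be used together with the variable selection method `stepclass`.

`pvs` performs a variable selection using the selection method chosen in `vs.method` for each pair of classes in `x`. Then for each pair of classes a submodel using `method` is trained (using only the variables selected earlier for this class-pair).

If `vs.method` is `ks.test`, then for each variable the empirical distribution functions of the cases of both classes are compared via `ks.test`. Only variables with a p-value below `niveau` are used for training the submodel for this pair of classes.

If `vs.method` is `stepclass`, the variable selection is performed using the `stepclass` method.

If `vs.method` is `greedy.wilks`, the variable selection is performed using Wilks' lambda criterion.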
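
For `vs.method = "ks.test"`, the idea is that each pair of classes gets its own variable subset: every variable is compared between the two classes with `ks.test`, and it is kept when the p-value falls below `niveau`. The following base R sketch illustrates that rule only. It is not klaR's implementation, `select_pairwise_ks` is a hypothetical helper name, and `iris` is used merely as a convenient three-class data set.

```r
## Sketch of per-pair variable selection via ks.test (base R only).
## select_pairwise_ks is a hypothetical helper, not part of klaR.
select_pairwise_ks <- function(x, grouping, niveau = 0.05) {
  x <- as.matrix(x)
  grouping <- as.factor(grouping)
  if (is.null(colnames(x))) colnames(x) <- paste0("V", seq_len(ncol(x)))

  ## All unordered pairs of class labels
  pairs <- combn(levels(grouping), 2, simplify = FALSE)

  result <- lapply(pairs, function(pr) {
    a <- x[grouping == pr[1], , drop = FALSE]
    b <- x[grouping == pr[2], , drop = FALSE]
    ## Two-sample KS test per variable; tie warnings are suppressed
    pvals <- vapply(seq_len(ncol(x)), function(j) {
      suppressWarnings(ks.test(a[, j], b[, j])$p.value)
    }, numeric(1))
    list(classpair = pr, variables = colnames(x)[pvals < niveau])
  })
  names(result) <- vapply(pairs, paste, character(1), collapse = "-")
  result
}

sel <- select_pairwise_ks(iris[, 1:4], iris$Species, niveau = 0.05)
sel[["versicolor-virginica"]]$variables
```

A classifier would then be trained per class pair on the cases of those two classes, restricted to the selected columns.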

##### Value

An object of class `pvs` containing the following components:

- classes
- the classes in grouping
- prior
- used prior probabilities
- method
- name of used classification function
- vs.method
- name of used function for variable selection
- submodels
- containing a list of submodels. For each pair of classes there is a list element being another list of 3 containing the class-pair of this submodel, the selected variables for the subspace of classes and the result of the trained classification function.
- call
- the (matched) function call
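
A short sketch of how the returned components might be inspected. It assumes klaR is installed and uses the built-in `iris` data purely as a convenient three-class example. The names of the three elements inside each submodel are not given here, so `str()` is used instead of guessing them.

```r
library("klaR")

## Fit with the defaults (method = "lda", vs.method = "ks.test");
## warnings (e.g. about ties in ks.test) are suppressed for brevity
model <- suppressWarnings(pvs(iris[, 1:4], iris$Species))

model$classes      # the classes in grouping
model$prior        # prior probabilities that were used
model$method       # name of the classification function
model$vs.method    # name of the variable selection function

## One submodel per pair of classes: three classes give choose(3, 2) = 3 pairs
length(model$submodels)

## Top-level structure of the first submodel (a list of 3)
str(model$submodels[[1]], max.level = 1)
```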

##### References

*From Data and Information Analysis to Knowledge Engineering*, eds Spiliopolou, M., Kruse, R., Borgelt, C., Nuernberger, A. and Gaul, W., pp. 700-708. Springer, Heidelberg.

##### See Also

`predict.pvs` for predicting `pvs` models and `locpvs` for pairwise variable selection in local models of several subclasses

##### Examples

```
## Example 1: learn an "lda" model on the waveform data using pairwise variable
## selection (pvs) with "ks.test" and compare it to lda without pvs
library("klaR")    # provides pvs(); attaches MASS, which provides lda()
library("mlbench")
trainset <- mlbench.waveform(300)
pvsmodel <- pvs(trainset$x, trainset$classes, niveau=0.05) # default: using method="lda"
## short summary, showing the class-pairs of the submodels and the selected variables
pvsmodel
testset <- mlbench.waveform(500)
## prediction of the test data set:
prediction <- predict(pvsmodel, testset$x)
## calculating the test error rate
1-sum(testset$classes==prediction$class)/length(testset$classes)
## Bayes error is 0.149
## comparison to performance of simple lda
ldamodel <- lda(trainset$x, trainset$classes)
LDAprediction <- predict(ldamodel, testset$x)
## test error rate
1-sum(testset$classes==LDAprediction$class)/length(testset$classes)
## Example 2: learn a "qda" model with pvs on half of the Satellite dataset,
## using "ks.test"
library("mlbench")
data("Satellite")
model <- pvs(classes ~ ., Satellite[1:3218,], method="qda", vs.method="ks.test")
## short summary, showing the class-pairs of the submodels and the selected variables
model
## now predict on the rest of the data set:
## pred <- predict(model,Satellite[3219:6435,]) # takes some time
pred <- predict(model,Satellite[3219:6435,], quick=TRUE) # that's much quicker
## now you can look at the predicted classes:
pred$class
## or the posterior probabilities:
pred$posterior
```

*Documentation reproduced from package klaR, version 0.6-11, License: GPL-2*