ordEval: Evaluation of ordered attributes

Description

The method evaluates the quality of the ordered attributes specified by the formula, using the ordEval algorithm.

Usage

ordEval(formula, data, file=NULL, rndFile=NULL, variant=c("allNear","attrDist1","classDist1"), ...)

Arguments

formula

A formula specifying the attributes to be evaluated and the target variable.

data

Data frame with evaluation data.

file

Name of the file to which the evaluation results are written.

rndFile

Name of the file to which the evaluation of random normalizing attributes is written.

variant

Name of the variant of the ordEval algorithm. Can be one of "allNear", "attrDist1", or "classDist1".

...

Additional options, some of which are shared with other context-sensitive evaluation methods (e.g., ReliefF).
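The arguments above can be combined as in the following minimal sketch. It assumes the CORElearn package is installed and that the data generated by ordDataGen contain a column named class (as in the Examples section). The choice of the first two attributes is purely illustrative, and the names attrs, f and est1 are hypothetical.

```r
library(CORElearn)

# generate a small ordinal data set (same generator as in the Examples section)
dat <- ordDataGen(200)

# build a formula that selects only two attributes; picking the first two
# non-class columns is an arbitrary choice made for illustration
attrs <- setdiff(names(dat), "class")[1:2]
f <- reformulate(attrs, response = "class")

# evaluate the selected attributes with the "attrDist1" variant;
# ordEvalNoRandomNormalizers is passed through the ... argument
est1 <- ordEval(f, dat, variant = "attrDist1", ordEvalNoRandomNormalizers = 100)

# inspect which attributes were evaluated and which variant was used
print(est1$attrNames)
print(est1$variant)
```

Leaving file and rndFile at their default NULL keeps the results in memory only.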

Value

The method returns a list with the following components:
reinfPosAV: a matrix of positive reinforcement for attributes' values
reinfNegAV: a matrix of negative reinforcement for attributes' values
anchorAV: a matrix of anchoring for attributes' values
noAV: a matrix containing count for each value of each attribute
reinfPosAttr: a vector of positive reinforcement for attributes
reinfNegAttr: a matrix of negative reinforcement for attributes
anchorAttr: a matrix of anchoring for attributes
noAVattr: a vector containing count of valid values of each attribute
rndReinfPosAV: a three dimensional array of statistics for random normalizing attributes' positive reinforcement for attributes' values
rndReinfNegAV: a three dimensional array of statistics for random normalizing attributes' negative reinforcement for attributes' values
rndAnchorAV: a three dimensional array of statistics for random normalizing attributes' anchoring for attributes' values
rndReinfPosAttr: a three dimensional array of statistics for random normalizing attributes' positive reinforcement for attributes
rndReinfNegAttr: a three dimensional array of statistics for random normalizing attributes' negative reinforcement for attributes
rndAnchorAttr: a three dimensional array of statistics for random normalizing attributes' anchoring for attributes
attrNames: the names of attributes
valueNames: the values of attributes
noAttr: number of attributes
ordVal: maximal number of attribute values
variant: the variant of the algorithm used
file: the file to store the results
rndFile: the file to store random normalizations

The statistics stored for the random normalizing attributes are the median, 1st quartile, 3rd quartile, low and high percentiles selected by ordEvalNormalizingPercentile, mean, standard deviation, and expected probability according to value distribution. These statistics make it possible to visualize the significance of reinforcements with an adapted box-and-whiskers plot.
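To make the listed statistics concrete, the sketch below computes most of them in base R for a made-up vector of scores. It is an illustration only, not the package's internal code: rndScores is hypothetical data, the percentile value p = 0.025 is an assumption standing in for ordEvalNormalizingPercentile, and the expected probability according to value distribution is omitted because it depends on the attribute value distribution.

```r
# hypothetical scores obtained from 100 random normalizing attributes
set.seed(1)
rndScores <- runif(100)

# assumed percentile, playing the role of ordEvalNormalizingPercentile
p <- 0.025

# summary statistics of the kind used for the adapted box-and-whiskers plot
stats <- c(
  median   = median(rndScores),
  Q1       = unname(quantile(rndScores, 0.25)),
  Q3       = unname(quantile(rndScores, 0.75)),
  lowPerc  = unname(quantile(rndScores, p)),
  highPerc = unname(quantile(rndScores, 1 - p)),
  mean     = mean(rndScores),
  sd       = sd(rndScores)
)
print(stats)
```

A reinforcement value that falls outside the interval between lowPerc and highPerc would then stand out from what random attributes typically produce.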

Details

The parameter formula is used as a mechanism to select features (attributes) and the prediction variable (class). Only simple terms can be used; interactions expressed in formula syntax are not supported. The simplest way is to specify just the response variable: class ~ . In this case all the other columns in the data set are evaluated. See the Examples section.

The output can optionally be written to the files file and rndFile, in a format used by the visualization methods in plotOrdEval.

The variant of the algorithm actually used is controlled by the variant parameter, which can have the values "allNear", "attrDist1", and "classDist1". The default value is "allNear", which takes all nearest neighbors into account in the evaluation of attributes. The variant "attrDist1" takes into account only neighbors whose attribute value differs by at most 1 from the current case (for each attribute separately). This makes sense when we want to see the thresholds of reinforcement, and therefore observe just a small change up or down. The "classDist1" variant takes into account only neighbors whose class value differs by at most 1 from the current case. This makes sense if we want to observe strictly small changes in upward/downward reinforcement, and it has little effect in practical applications.

There are some additional parameters ..., some of which are common with other context-sensitive evaluation methods (e.g., ReliefF). Their list and a short description are available in helpCore (see the subsection on the ordEval algorithm and attribute evaluation therein). Evaluation of attributes without the specifics of ordered attributes is covered in the function attrEval.
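The neighbor restrictions described for "attrDist1" and "classDist1" can be illustrated with a toy sketch in base R. This is not the package's implementation: the data, the integer coding of ordinal values, and the names current and neighbors are all hypothetical, and the sketch only shows the "at most 1 different" filtering idea.

```r
# current case and its candidate neighbors; ordinal attribute and class
# values are coded as integers (hypothetical toy values)
current <- list(attr = 3, class = 2)
neighbors <- data.frame(attr  = c(1, 2, 3, 4, 5),
                        class = c(2, 4, 1, 3, 2))

# "attrDist1"-style restriction: keep neighbors whose attribute value
# differs from the current case by at most 1
attrDist1 <- neighbors[abs(neighbors$attr - current$attr) <= 1, ]

# "classDist1"-style restriction: keep neighbors whose class value
# differs from the current case by at most 1
classDist1 <- neighbors[abs(neighbors$class - current$class) <= 1, ]

print(attrDist1)
print(classDist1)
```

With "allNear" no such restriction would be applied and all five neighbors would contribute.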

References

Marko Robnik-Sikonja, Koen Vanhoof: Evaluation of ordinal attributes at value level. Data Mining and Knowledge Discovery, 14:225-243, 2007

Marko Robnik-Sikonja, Igor Kononenko: Theoretical and Empirical Analysis of ReliefF and RReliefF. Machine Learning Journal, 53:23-69, 2003

Some of the references are available also from http://lkm.fri.uni-lj.si/rmarko/papers/

Examples

library(CORElearn)

# prepare a data set
dat <- ordDataGen(200)

# evaluate ordered features with ordEval
est <- ordEval(class ~ ., dat, ordEvalNoRandomNormalizers=100)
print(est)         # raw list of results
printOrdEval(est)  # formatted text output
plot(est)          # visualization of the results