CORElearn (version 0.9.29)

attrEval: Attribute evaluation

Description

The method evaluates the quality of the features (attributes, independent variables) specified by the formula using the selected heuristic method. Feature evaluation algorithms available for classification problems include several variants of the Relief and ReliefF algorithms (ReliefF, cost-sensitive ReliefF, ...), gain ratio, Gini index, MDL, DKM, information gain, and others. For regression problems, RReliefF, MSE, MAE, and others are available. Parallel execution on several cores is supported for speedup.

Usage

attrEval(formula, data, estimator, costMatrix = NULL, ...)

Arguments

formula
Formula specifying the predictors to be evaluated and the target variable.
data
Data frame with evaluation data.
estimator
The name of the evaluation method.
costMatrix
Optional cost matrix.
...
Additional options used by specific evaluation methods, as described in helpCore.

Value

  • Vector of evaluations for the features in the order specified by the formula.
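
The returned vector can be ordered to rank the features. The sketch below assumes the vector carries the attribute names, and rankAttributes is a hypothetical helper introduced only for this illustration. The CORElearn call is guarded so the snippet also runs where the package is not installed.

```r
# Hypothetical helper (not part of CORElearn): order evaluations
# from the highest to the lowest score, keeping the names.
rankAttributes <- function(evaluations) {
  sort(evaluations, decreasing = TRUE)
}

# Evaluate the iris attributes only if CORElearn is available.
if (requireNamespace("CORElearn", quietly = TRUE)) {
  est <- CORElearn::attrEval(Species ~ ., iris,
                             estimator = "ReliefFexpRank",
                             ReliefIterations = 30)
  print(rankAttributes(est))      # all attributes, best first
  print(names(which.max(est)))    # name of the top-scoring attribute
}
```

Whether a higher score means a better attribute depends on the chosen estimator, so the sort direction is an assumption to check for each measure.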

Details

The formula parameter selects the features (attributes) and the prediction variable (class). Only simple terms can be used; interactions expressed in formula syntax are not supported. The simplest way is to specify just the response variable, as in class ~ . (all other attributes in the data set are then evaluated). See also the example below.

The optional parameter costMatrix provides a nonuniform cost matrix to the cost-sensitive classification measures (ReliefFexpC, ReliefFavgC, ReliefFpe, ReliefFpa, ReliefFsmp, GainRatioCost, DKMcost, ReliefKukar, and MDLsmp). For other measures this parameter is ignored. The format of the matrix is costMatrix(true class, predicted class). By default uniform costs are assumed, i.e., costMatrix(i, i) = 0 and costMatrix(i, j) = 1 for i not equal to j.

The estimator parameter selects the evaluation heuristic. For classification problems it must be one of the names returned by infoCore(what="attrEval"), and for regression problems one of the names returned by infoCore(what="attrEvalReg"). The majority of these feature evaluation measures are described in the references given below.

For classification problems the implemented measures are those returned by infoCore(what="attrEval"); for regression problems they are those returned by infoCore(what="attrEvalReg").

Some additional parameters, passed through ..., are used by specific evaluation heuristics. Their list and short descriptions are available by calling helpCore (see the section on attribute evaluation). Attributes can also be evaluated on the random forest out-of-bag set with the function rfAttrEval. Evaluation and visualization of ordered attributes is covered by the function ordEval.
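
The cost matrix convention described above can be sketched in base R. The cost values here are invented for illustration, and the row/column labels are added only for readability (it is an assumption that attrEval accepts a matrix with dimnames). The attrEval call is guarded so the matrix construction runs even without CORElearn installed.

```r
# Class labels of the target; iris ships with base R.
classes <- levels(iris$Species)
k <- length(classes)

# Uniform costs: 0 on the diagonal, 1 elsewhere (the stated default).
uniformCosts <- 1 - diag(k)
dimnames(uniformCosts) <- list(true = classes, predicted = classes)

# Nonuniform costs (illustrative values only): predicting versicolor
# for a true virginica is assumed to cost five times as much.
costs <- uniformCosts
costs["virginica", "versicolor"] <- 5

# Rows index the true class, columns the predicted class.
print(costs)

# Pass the matrix to a cost-sensitive measure if CORElearn is available.
if (requireNamespace("CORElearn", quietly = TRUE)) {
  estCost <- CORElearn::attrEval(Species ~ ., iris,
                                 estimator = "ReliefFexpC",
                                 costMatrix = costs)
  print(estCost)
}
```

With a measure that is not cost-sensitive, the same call would ignore costMatrix, as noted above.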

References

Marko Robnik-Sikonja, Igor Kononenko: Theoretical and Empirical Analysis of ReliefF and RReliefF. Machine Learning Journal, 53:23-69, 2003.

Marko Robnik-Sikonja: Experiments with Cost-sensitive Feature Evaluation. In Lavrac et al. (eds): Machine Learning, Proceedings of ECML 2003, Springer, Berlin, 2003, pp. 325-336.

Igor Kononenko: On Biases in Estimating Multi-Valued Attributes. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI'95), pp. 1034-1040, 1995.

Some of these references are also available from http://lkm.fri.uni-lj.si/rmarko/papers/

See Also

CORElearn, CoreModel, rfAttrEval, ordEval, helpCore, infoCore.

Examples

# load the package
library(CORElearn)

# use the iris data set

# run method ReliefF with exponential rank distance
estReliefF <- attrEval(Species ~ ., iris,
                       estimator="ReliefFexpRank", ReliefIterations=30)
print(estReliefF)

# print all available estimators
infoCore(what="attrEval")
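
A regression target works the same way. This sketch treats Sepal.Length as a numeric response and selects two predictors by name. To avoid hard-coding an estimator name, it takes the first entry returned by infoCore(what="attrEvalReg"), which is an arbitrary choice made only for illustration.

```r
# Regression formula: numeric response, two explicitly named predictors.
regFormula <- Sepal.Length ~ Petal.Length + Petal.Width

if (requireNamespace("CORElearn", quietly = TRUE)) {
  # Names of the estimators valid for regression problems.
  regEstimators <- CORElearn::infoCore(what = "attrEvalReg")
  print(regEstimators)

  # Use the first listed estimator rather than assuming a specific name.
  estReg <- CORElearn::attrEval(regFormula, iris,
                                estimator = regEstimators[1])
  print(estReg)
}
```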