FS.greedy.heuristic.reduct.RST: The greedy heuristic algorithm for determining a reduct

Description

This function implements a greedy heuristic method for feature selection based on rough set theory (RST).

Usage

FS.greedy.heuristic.reduct.RST(decision.table,
    decisionIdx = ncol(decision.table), qualityF = X.gini,
    nAttrs = NULL, epsilon = 0, ...)

Arguments

decision.table

an object of class "DecisionTable" representing the decision table. See SF.asDecisionTable.

decisionIdx

an integer value giving the index of the decision attribute. The default is the last column of decision.table.

qualityF

a function that measures the quality of a subset of attributes. This package includes the following functions:

X.entropy: See X.entropy.
X.gini: See X.gini.

nAttrs

a vector representing indexes of conditional attributes.

epsilon

a numeric value in [0, 1] that controls whether approximate reducts are used. The default value 0 disables the approximation.

...

other parameters.

Value

An object of class "FeatureSubset" that contains the following components:
- reduct: a list representing a single reduct. In this case, it could be a superreduct or just a subset of features.
- type.method: a string representing the type of method, which is "greedy.heuristic".
- type.task: a string showing the type of task, which is "feature selection".
- model: a string representing the type of model. In this case, it is "RST", which means rough set theory.

Details

This function provides several quality measures for subsets of attributes. These measures determine how well a subset qualifies as a reduct. For example, X.entropy is a measure of information gain. A measure is selected through the qualityF parameter.
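To make the idea of a quality measure concrete, the following is a minimal sketch in base R of entropy-based and Gini-based gain for a single attribute. The helpers `entropy_impurity`, `gini_impurity`, and `attribute_gain`, as well as the toy data, are hypothetical names introduced for illustration. They are not the package's X.entropy or X.gini, whose exact signatures are not shown on this page.

```r
# Hypothetical helpers for illustration only; NOT the package's X.entropy / X.gini.

# Shannon entropy (in bits) of a vector of decision labels.
entropy_impurity <- function(labels) {
  p <- table(labels) / length(labels)
  p <- p[p > 0]                      # drop empty classes so log2(0) never occurs
  -sum(p * log2(p))
}

# Gini impurity of a vector of decision labels.
gini_impurity <- function(labels) {
  p <- table(labels) / length(labels)
  1 - sum(p^2)
}

# Gain of one attribute: impurity of the decision minus the size-weighted
# impurity inside each group of objects sharing the same attribute value.
attribute_gain <- function(attribute, decision, impurity) {
  groups  <- split(decision, attribute)
  weights <- lengths(groups) / length(decision)
  within  <- vapply(groups, impurity, numeric(1))
  impurity(decision) - sum(weights * within)
}

# Toy data, made up for this sketch.
toy <- data.frame(
  outlook = c("sunny", "sunny", "rain", "rain"),
  windy   = c("yes", "no", "yes", "no"),
  play    = c("no", "no", "yes", "yes"),
  stringsAsFactors = FALSE
)

gain_outlook <- attribute_gain(toy$outlook, toy$play, entropy_impurity)
gain_windy   <- attribute_gain(toy$windy,   toy$play, entropy_impurity)
```

In this toy table `outlook` separates the two decision classes completely while `windy` does not separate them at all, so a greedy method guided by either measure would prefer `outlook`.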

Additionally, this function implements $\epsilon$-approximate reducts. The method approximates the original decision model by producing an approximate reduct, which is a subset of attributes. An $\epsilon$-approximate reduct $B$ satisfies

$Disc_{\mathcal{A}}(B) \ge (1 - \epsilon)Disc_{\mathcal{A}}(A)$

where $Disc_{\mathcal{A}}(B)$ is the discernibility measure of the attribute subset $B$ in the decision table $\mathcal{A}$, $A$ is the full set of attributes, and $\epsilon$ is a numeric value between 0 and 1. With $\epsilon = 0$, $B$ must discern as much as $A$. Several publications explain this topic comprehensively, for example (A. Janusz and S. Stawicki, 2011; D. Slezak, 2002; J. Wroblewski, 2001), which serve as the references for this function.
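The stopping condition above can be sketched in base R as follows. This is a simplified illustration and not the package's implementation: it assumes the discernibility measure counts pairs of objects with different decisions that differ on at least one selected attribute, and it uses that count directly as the greedy score instead of a qualityF function. The names `disc_measure`, `greedy_approx_reduct`, and the toy table are hypothetical.

```r
# Hypothetical helper: number of object pairs that have different decisions
# and differ on at least one attribute in `attrs` (assumed definition of Disc).
disc_measure <- function(dt, attrs, decision) {
  n <- nrow(dt)
  if (n < 2 || length(attrs) == 0) return(0)
  pairs <- combn(n, 2)               # all unordered pairs of row indices
  i <- pairs[1, ]
  j <- pairs[2, ]
  diff_decision <- dt[[decision]][i] != dt[[decision]][j]
  diff_attrs <- Reduce(`|`, lapply(attrs, function(a) dt[[a]][i] != dt[[a]][j]))
  sum(diff_decision & diff_attrs)
}

# Greedy sketch: add the attribute that raises Disc the most until
# Disc(B) >= (1 - epsilon) * Disc(A) holds.
greedy_approx_reduct <- function(dt, decision, epsilon = 0) {
  conditional <- setdiff(names(dt), decision)
  target <- (1 - epsilon) * disc_measure(dt, conditional, decision)
  selected <- character(0)
  while (disc_measure(dt, selected, decision) < target) {
    candidates <- setdiff(conditional, selected)
    scores <- vapply(candidates,
                     function(a) disc_measure(dt, c(selected, a), decision),
                     numeric(1))
    selected <- c(selected, candidates[which.max(scores)])
  }
  selected
}

# Toy decision table, made up for this sketch.
toy <- data.frame(
  x1 = c(1, 1, 2, 2, 3, 3),
  x2 = c(1, 2, 1, 2, 1, 2),
  x3 = c(1, 1, 1, 1, 1, 2),
  d  = c("no", "no", "yes", "yes", "no", "yes"),
  stringsAsFactors = FALSE
)

exact  <- greedy_approx_reduct(toy, "d", epsilon = 0)    # needs full discernibility
approx <- greedy_approx_reduct(toy, "d", epsilon = 0.2)  # tolerates a 20% loss
```

In this toy table the full attribute set discerns 9 pairs and `x1` alone discerns 8, so `x1` already meets the threshold of 7.2 when `epsilon = 0.2`, while `epsilon = 0` forces a second attribute to be added.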

Additionally, SF.applyDecTable can be used to generate a new decision table from the resulting feature subset.

References

A. Janusz and S. Stawicki, "Applications of Approximate Reducts to the Feature Selection Problem", Proceedings of International Conference on Rough Sets and Knowledge Technology (RSKT), vol. 6954, pp. 45-50 (2011).

D. Slezak, "Approximate Entropy Reducts", Fundamenta Informaticae, vol. 53, no. 3-4, pp. 365-390 (2002).

J. Wroblewski, "Ensembles of Classifiers Based on Approximate Reducts", Fundamenta Informaticae, vol. 47, no. 3-4, pp. 351-360 (2001).

Examples


###################################################
## Example 1: Evaluate reduct and generate
##            new decision table
###################################################
library(RoughSets)
data(RoughSetData)
decision.table <- RoughSetData$hiring.dt

## evaluate a single reduct
res.1 <- FS.greedy.heuristic.reduct.RST(decision.table, qualityF = X.entropy,
                                        epsilon = 0.0)

## generate a new decision table corresponding to the reduct
new.decTable <- SF.applyDecTable(decision.table, res.1)
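As a further illustration, the sketch below requests an approximate reduct by setting epsilon above 0 and then inspects the components listed in the Value section. It assumes the RoughSets package is installed; the choice of `epsilon = 0.1` and the variable name `res.2` are arbitrary and only for demonstration, and the resulting subset is not shown here because it depends on the data and package version.

```r
## Guarded so that the snippet is skipped when RoughSets is not installed
if (requireNamespace("RoughSets", quietly = TRUE)) {
  library(RoughSets)
  data(RoughSetData)
  decision.table <- RoughSetData$hiring.dt

  ## request an approximate reduct (epsilon > 0) using the Gini-based measure
  res.2 <- FS.greedy.heuristic.reduct.RST(decision.table, qualityF = X.gini,
                                          epsilon = 0.1)

  ## inspect the components described in the Value section
  print(res.2$reduct)
  print(res.2$type.method)
}
```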
