B.Introduction-FuzzyRoughSets: Introduction to Fuzzy Rough Set Theory

Description

This part introduces briefly fuzzy rough set theory (FRST) and its application to data analysis. Since recently there are a lot of FRST variants that have been proposed by researchers, in this introduction, we only provide some basic concepts of FRST based on (Radzikowska and Kerre, 2002).

Arguments

Details

Just like in RST (see A.Introduction-RoughSets), a data set is represented as a table called an information system $\mathcal{A} = (U, A)$, where $U$ is a non-empty set of finite objects as the universe of discourse (note: it refers to all instances/experiments/rows in datasets) and $A$ is a non-empty finite set of attributes, such that $a : U \to V_{a}$ for every $a \in A$. The set $V_{a}$ is the set of values that attribute $a$ may take. Information systems that involve a decision attribute, containing classes or decision values of each objects, are called decision systems (or said as decision tables). More formally, it is a pair $\mathcal{A} = (U, A \cup {d})$, where $d \notin A$ is the decision attribute. The elements of $A$ are called conditional attributes. However, different from RST, FRST has several ways to express indiscernibility. Fuzzy indiscernibility relation (FIR) is used for any fuzzy relation that determines the degree to which two objects are indiscernible. We consider some special cases of FIR.

fuzzy tolerance relation: this relation has properties which are reflexive and symmetric where reflexive:$R(x,x) = 1$symmetric:$R(x,y) = R(y,x)$
similarity relation (also called fuzzy equivalence relation): this relation has properties not only reflexive and symmetric but also transitive defined as$min(R(x,y), R(y,z)) \le R(x,z)$
$\mathcal{T}$-similarity relation (also called fuzzy$\mathcal{T}$-equivalence relation): this relation is a fuzzy tolerance relation that is also$\mathcal{T}$-transitive.$\mathcal{T}(R(x,y), R(y,z)) \le R(x,z)$, for a given triangular norm$\mathcal{T}$.

The following equations are the tolerance relations on a quantitative attribute $a$, $R_a$, proposed by (Jensen and Shen, 2009).

eq.1:$R_a(x,y) = 1 - \frac{|a(x) - a(y)|}{|a_{max} - a_{min}|}$
eq.2:$R_a(x,y) = exp(-\frac{(a(x) - a(y))^2}{2 \sigma_a^2})$
eq.3:$R_a(x,y) = max(min(\frac{a(y) - a(x) + \sigma_a}{\sigma_a}, \frac{a(x) - a(y) + \sigma_a}{\sigma_a}), 0)$

where $\sigma_{a}^2$ is the variance of feature $a$ and $a_{min}$ and $a_{max}$ are the minimal and maximal values of data supplied by user. Additionally, other relations have been implemented in BC.IND.relation.FRST For a qualitative (i.e., nominal) attribute $a$, the classical manner of discerning objects is used, i.e., $R_a(x,y) = 1$ if $a(x) = a(y)$ and $R_a(x,y) = 0$, otherwise. We can then define, for any subset $B$ of $A$, the fuzzy $B$-indiscernibility relation by $R_B(x,y) = \mathcal{T}(R_a(x,y))$, where $\mathcal{T}$ is a t-norm operator, for instance minimum, product and Lukasiewicz t-norm. In general, $\mathcal{T}$ can be replaced by any aggregation operator, like e.g., the average. In the context of FRST, according to (Radzikowska and Kerre, 2002) lower and upper approximation are generalized by means of an implicator $\mathcal{I}$ and a t-norm $\mathcal{T}$. The following are the fuzzy $B$-lower and $B$-upper approximations of a fuzzy set $A$ in $U$ $(R_B \downarrow A)(y) = inf_{x \in U} \mathcal{I}(R_B(x,y), A(x))$ $(R_B \uparrow A)(y) = sup_{x \in U} \mathcal{T}(R_B(x,y), A(x))$ The underlying meaning is that $R_B \downarrow A$ is the set of elements necessarily satisfying the concept (strong membership), while $R_B \uparrow A$ is the set of elements possibly belonging to the concept (weak membership). Many other ways to define the approximations can be found in BC.LU.approximation.FRST. Mainly, these were designed to deal with noise in the data and to make the approximations more robust. Based on fuzzy $B$-indiscernibility relations, we define the fuzzy $B$-positive region by, for $y \in X$, $POS_B(y) = (\cup_{x \in U} R_B \downarrow R_dx)(y)$ We can define the degree of dependency of $d$ on $B$, $\gamma_{B}$ by $\gamma_{B} = \frac{|POS_{B}|}{|U|} = \frac{\sum_{x \in U} POS_{B}(x)}{|U|}$ A decision reduct is a set $B \subseteq A$ such that $\gamma_{B} = \gamma_{A}$ and $\gamma_{B'} = \gamma_{B}$ for every $B' \subset B$. As we know from rough set concepts (See A.Introduction-RoughSets), we are able to calculate the decision reducts by constructing the decision-relative discernibility matrix. Based on (Tsang et al, 2008), the discernibility matrix can be defined as follows. The discernibility matrix is an $n \times n$ matrix $(c_{ij})$ where for $i,j = 1, \ldots, n$ 1) $c_{ij}= {a \in A : 1 - R_{a}(x_i, x_j) \ge \lambda_i}$ if $\lambda_j < \lambda_i$. 2) $c_{ij}={\oslash}$, otherwise. with $\lambda_i = (R_A \downarrow R_{d}x_{i})(x_i)$ and $\lambda_j = (R_A \downarrow R_{d}x_{j})(x_{j})$ Other approaches of discernibility matrix can be read at BC.discernibility.mat.FRST. The other implementations of the FRST concepts can be seen at BC.IND.relation.FRST, BC.LU.approximation.FRST, and BC.positive.reg.FRST.

References

A. M. Radzikowska and E. E. Kerre, "A Comparative Study of Fuzzy Rough Sets", Fuzzy Sets and Systems, vol. 126, p. 137 - 156 (2002). D. Dubois and H. Prade, "Rough Fuzzy Sets and Fuzzy Rough Sets", International Journal of General Systems, vol. 17, p. 91 - 209 (1990). E. C. C. Tsang, D. G. Chen, D. S. Yeung, X. Z. Wang, and J. W. T. Lee, "Attributes Reduction Using Fuzzy Rough Sets", IEEE Trans. Fuzzy Syst., vol. 16, no. 5, p. 1130 - 1141 (2008). L. A. Zadeh, "Fuzzy Sets", Information and Control, vol. 8, p. 338 - 353 (1965). R. Jensen and Q. Shen, "New Approaches to Fuzzy-Rough Feature Selection", IEEE Trans. on Fuzzy Systems, vol. 19, no. 4, p. 824 - 838 (2009). Z. Pawlak, "Rough Sets", International Journal of Computer and Information Sciences, vol. 11, no. 5, p. 341 - 356 (1982).