A.Introduction-RoughSets: Introduction to Rough Set Theory
Description
This part introduces rough set theory (RST) and
its application to data analysis. While the classical RST
proposed by Pawlak in 1982 is explained in detail in this
section, some recent advancements are treated in the
documentation of the related functions.
Details
In RST, a data set is represented as a table called an
information system $\mathcal{A} = (U, A)$, where
$U$ is a non-empty finite set of objects known as the
universe of discourse (i.e., all instances/rows in the
data set) and $A$ is a non-empty finite set of attributes,
such that $a : U \to V_{a}$ for every $a \in A$. The set
$V_{a}$ is the set of values that attribute $a$ may take.
Information systems that involve a decision attribute,
containing the class of each object, are called decision
systems or decision tables. More formally, a decision
system is a pair $\mathcal{A} = (U, A \cup \{d\})$, where
$d \notin A$ is the decision attribute. The elements of
$A$ are called conditional attributes. The information
system representing all data in a particular domain may
contain redundant parts, because some objects are
identical or indiscernible, or some attributes are
superfluous. The indiscernibility relation, a binary
equivalence relation between objects, captures this. Let
$\mathcal{A} = (U, A)$ be an information system; then
for any $B \subseteq A$ there is an equivalence
relation $R_B$:
$R_B = \{(x,y) \in U^2 \mid \forall a \in B,\ a(x) = a(y)\}$
If $(x,y) \in R_B$, then $x$ and $y$ are
indiscernible by attributes from $B$. The equivalence
classes of the $B$-indiscernibility relation are
denoted $[x]_{B}$. The indiscernibility relation will
be further used to define basic concepts of rough set
theory which are lower and upper approximations.
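As a minimal illustration (in Python rather than the package's R, with an invented toy table), the equivalence classes of $R_B$ can be computed by grouping objects that agree on every attribute in $B$:

```python
from collections import defaultdict

# Toy information system: each row maps an attribute name to a value.
# The table and attribute names are invented for illustration only.
U = [
    {"outlook": "sunny", "windy": True,  "play": "no"},
    {"outlook": "sunny", "windy": True,  "play": "no"},
    {"outlook": "rainy", "windy": True,  "play": "no"},
    {"outlook": "rainy", "windy": False, "play": "yes"},
]

def indiscernibility_classes(U, B):
    """Partition object indices into equivalence classes of R_B:
    two objects fall in one class iff they agree on every a in B."""
    classes = defaultdict(list)
    for i, x in enumerate(U):
        classes[tuple(x[a] for a in B)].append(i)
    return list(classes.values())

print(indiscernibility_classes(U, ["outlook"]))           # [[0, 1], [2, 3]]
print(indiscernibility_classes(U, ["outlook", "windy"]))  # [[0, 1], [2], [3]]
```

Objects 0 and 1 agree on both conditional attributes, so they remain indiscernible even under the full attribute set.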
Let $B \subseteq A$ and $X \subseteq U$, $X$
can be approximated using the information contained within
$B$ by constructing the $B$-lower and $B$-upper
approximations of $X$:
$R_B \downarrow X = \{ x \in U \mid [x]_{B} \subseteq X \}$
$R_B \uparrow X = \{ x \in U \mid [x]_{B} \cap X \neq \emptyset \}$
The tuple $\langle R_B \downarrow X, R_B \uparrow X
\rangle$ is called a rough set. The objects in $R_B
\downarrow X$ can be classified with certainty as
members of $X$ on the basis of the knowledge in $B$,
while the objects in $R_B \uparrow X$ can only be
classified as possible members of $X$ on the basis of
the knowledge in $B$.
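A sketch of both approximations in Python (the toy table is invented for illustration; the package's own implementation is the R function BC.LU.approximation.RST):

```python
from collections import defaultdict

# Toy decision table (invented for illustration).
U = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "sunny", "play": "yes"},
    {"outlook": "rainy", "play": "yes"},
]

def partition(U, attrs):
    """Equivalence classes (as sets of object indices) of the
    indiscernibility relation induced by the given attributes."""
    classes = defaultdict(set)
    for i, x in enumerate(U):
        classes[tuple(x[a] for a in attrs)].add(i)
    return list(classes.values())

def approximations(U, B, X):
    """B-lower and B-upper approximations of a set X of object indices."""
    X = set(X)
    lower, upper = set(), set()
    for cls in partition(U, B):
        if cls <= X:   # class entirely inside X -> certain members
            lower |= cls
        if cls & X:    # class overlaps X -> possible members
            upper |= cls
    return lower, upper

low, up = approximations(U, ["outlook"], {1, 2})  # X = objects labelled "yes"
print(low, up)  # {2} {0, 1, 2}
```

Objects 0 and 1 are indiscernible on outlook yet carry different decisions, so they enter the upper but not the lower approximation of $X$.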
In a decision system, for $X$ we use decision concepts
(equivalence classes of decision attribute) $[x]_d$. We
can define $B$-lower and $B$-upper approximations
as follows.
$R_B \downarrow [x]_d = \{ y \in U \mid [y]_{B} \subseteq [x]_d \}$
$R_B \uparrow [x]_d = \{ y \in U \mid [y]_{B} \cap [x]_d \neq \emptyset \}$
The positive, negative and boundary regions with respect
to $B$ can be defined as follows. The positive region
$POS_{B}$ contains all objects that can be classified
with certainty:
$POS_{B} = \bigcup_{x \in U} R_B \downarrow [x]_d$
The boundary region, $BND_{B}$, is the set of objects
that can possibly, but not certainly, be classified:
$BND_{B} = \bigcup_{x \in U} R_B \uparrow [x]_d -
\bigcup_{x \in U} R_B \downarrow [x]_d$
The negative region of a single concept $X$ is $U - R_B
\uparrow X$, the set of objects that certainly do not
belong to $X$.
Furthermore, we can calculate the degree of dependency of
the decision on a set of attributes. For $B \subseteq A$,
it is said that $d$ depends on $B$ in a degree of
dependency $\gamma_{B} = \frac{|POS_{B}|}{|U|}$. The
decision attribute $d$ depends totally on $B$, denoted
$B \Rightarrow d$, if $\gamma_{B} = 1$, i.e. all values
of $d$ are uniquely determined by the values of the
attributes in $B$.
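The positive and boundary regions and the degree of dependency can be sketched in Python as follows (toy table invented for illustration; the package's R counterpart is BC.positive.reg.RST):

```python
from collections import defaultdict
from fractions import Fraction

# Toy decision table (invented for illustration); objects 0 and 1 are
# indiscernible on "outlook" but carry different decisions.
U = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "sunny", "play": "yes"},
    {"outlook": "rainy", "play": "yes"},
]

def partition(U, attrs):
    """Equivalence classes (sets of object indices) on the given attributes."""
    classes = defaultdict(set)
    for i, x in enumerate(U):
        classes[tuple(x[a] for a in attrs)].add(i)
    return list(classes.values())

def regions(U, B, d):
    """Positive region, boundary region and degree of dependency gamma_B."""
    b_classes = partition(U, B)
    pos, upper_union = set(), set()
    for dec_class in partition(U, [d]):
        for cls in b_classes:
            if cls <= dec_class:   # part of the lower approximation
                pos |= cls
            if cls & dec_class:    # part of the upper approximation
                upper_union |= cls
    bnd = upper_union - pos
    gamma = Fraction(len(pos), len(U))
    return pos, bnd, gamma

pos, bnd, gamma = regions(U, ["outlook"], "play")
print(pos, bnd, gamma)  # {2} {0, 1} 1/3
```

Here only object 2 is classified with certainty, so $\gamma_{B} = 1/3$; the inconsistent pair 0, 1 forms the boundary region.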
A decision reduct is a set $B \subseteq A$ such that
$\gamma_{B} = \gamma_{A}$ and $\gamma_{B'} <
\gamma_{B}$ for every $B' \subset B$. One algorithm to
determine all reducts is by constructing the
decision-relative discernibility matrix. The discernibility
matrix $M(\mathcal{A})$ is an $n \times n$ matrix
$(c_{ij})$ where
$c_{ij} = \{ a \in A : a(x_i) \neq a(x_j) \}$ if
$d(x_i) \neq d(x_j)$, and
$c_{ij} = \emptyset$ otherwise.
The discernibility function $f_{\mathcal{A}}$ for a
decision system $\mathcal{A}$ is a boolean function of
$m$ boolean variables $\bar{a}_1, \ldots,
\bar{a}_m$ corresponding to the attributes $a_1,
\ldots, a_m$ respectively, and defined by
$f_{\mathcal{A}}(\bar{a}_1, \ldots, \bar{a}_m) = \bigwedge
\{ \bigvee \bar{c}_{ij} : 1 \le j < i \le n,\ c_{ij} \neq \emptyset \}$
where $\bar{c}_{ij} = \{ \bar{a} : a \in c_{ij} \}$. The
decision reducts of $\mathcal{A}$ are then the prime
implicants of the function $f_{\mathcal{A}}$. The complete
explanation of the algorithm can be seen in (Skowron and
Rauszer, 1992).
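Since $f_{\mathcal{A}}$ is a conjunction of positive clauses, its prime implicants correspond to the minimal attribute sets that intersect every nonempty $c_{ij}$. A brute-force Python sketch of this (exponential in $|A|$, with an invented toy table, purely for illustration):

```python
from itertools import combinations

# Toy decision table (invented for illustration).
U = [
    {"outlook": "sunny", "temp": "hot",  "windy": False, "play": "no"},
    {"outlook": "sunny", "temp": "hot",  "windy": True,  "play": "no"},
    {"outlook": "rainy", "temp": "mild", "windy": False, "play": "yes"},
    {"outlook": "rainy", "temp": "hot",  "windy": True,  "play": "no"},
]
A = ["outlook", "temp", "windy"]

def discernibility_entries(U, A, d):
    """Nonempty entries c_ij of the decision-relative discernibility matrix."""
    entries = []
    for i in range(len(U)):
        for j in range(i):
            if U[i][d] != U[j][d]:
                c = frozenset(a for a in A if U[i][a] != U[j][a])
                if c:
                    entries.append(c)
    return entries

def reducts(U, A, d):
    """All decision reducts, found as the minimal attribute sets that
    intersect every nonempty c_ij (the prime implicants of f_A)."""
    entries = discernibility_entries(U, A, d)
    found = []
    for size in range(1, len(A) + 1):
        for B in combinations(A, size):
            B = set(B)
            if any(r <= B for r in found):
                continue  # proper superset of a known reduct: not minimal
            if all(B & c for c in entries):
                found.append(B)
    return found

print(reducts(U, A, "play"))
# -> [{'temp'}, {'outlook', 'windy'}] (element order within sets may vary)
```

For this table $f_{\mathcal{A}} = (\bar{o} \vee \bar{t}) \wedge (\bar{o} \vee \bar{t} \vee \bar{w}) \wedge (\bar{t} \vee \bar{w})$, whose prime implicants are $\bar{t}$ and $\bar{o} \wedge \bar{w}$, matching the two reducts found.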
The implementations of these RST concepts can be seen in
BC.IND.relation.RST, BC.LU.approximation.RST,
BC.positive.reg.RST, and BC.discernibility.mat.RST.
References
A. Skowron and C. Rauszer, "The Discernibility Matrices and
Functions in Information Systems", in: R. Slowinski (Ed.),
Intelligent Decision Support: Handbook of Applications and
Advances of Rough Sets Theory, Kluwer Academic Publishers,
Dordrecht, Netherlands, pp. 331 - 362 (1992).
Z. Pawlak, "Rough Sets", International Journal of Computer
and Information Sciences, vol. 11, no. 5, pp. 341 - 356 (1982).
Z. Pawlak, "Rough Sets: Theoretical Aspects of Reasoning
about Data", System Theory, Knowledge Engineering and
Problem Solving, vol. 9, Kluwer Academic Publishers,
Dordrecht, Netherlands (1991).