A.Introduction-RoughSets: Introduction to Rough Set Theory
Description
This part introduces rough set theory (RST) and
its application to data analysis. While the classical RST
proposed by Pawlak in 1982 is explained in detail in this
section, some recent advancements are treated in the
documentation of the related functions.
Details
In RST, a data set is represented as a table called an
information system $\mathcal{A} = (U, A)$, where
$U$ is a non-empty finite set of objects known as the
universe of discourse (i.e., all instances/rows in the
data set) and $A$ is a non-empty finite set of attributes,
such that $a : U \to V_{a}$ for every $a \in A$. The set
$V_{a}$ is the set of values that attribute $a$ may take.
Information systems that involve a decision attribute,
containing the class of each object, are called decision
systems or decision tables. More formally, a decision
system is a pair $\mathcal{A} = (U, A \cup \{d\})$, where
$d \notin A$ is the decision attribute. The elements of
$A$ are called conditional attributes. The information
system representing all data in a particular domain may
contain redundant parts, because some objects are
identical or indiscernible, or some attributes are
superfluous. The indiscernibility relation, a binary
equivalence relation between objects, captures this. Let
$\mathcal{A} = (U, A)$ be an information system; then
for any $B \subseteq A$ there is an equivalence
relation $R_B$:
$R_B = \{(x,y) \in U^2 \mid \forall a \in B,\ a(x) = a(y)\}$
If $(x,y) \in R_B$, then $x$ and $y$ are
indiscernible by attributes from $B$. The equivalence
classes of the $B$-indiscernibility relation are
denoted $[x]_{B}$. The indiscernibility relation will
be further used to define basic concepts of rough set
theory which are lower and upper approximations.
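As a minimal illustration (in Python rather than the package's R, with an invented toy table), the equivalence classes of $R_B$ can be computed by grouping objects that agree on every attribute in $B$:

```python
from collections import defaultdict

# Toy information system: each row maps an attribute name to a value.
# The table and attribute names are invented for illustration only.
U = [
    {"outlook": "sunny", "windy": True,  "play": "no"},
    {"outlook": "sunny", "windy": True,  "play": "no"},
    {"outlook": "rainy", "windy": True,  "play": "no"},
    {"outlook": "rainy", "windy": False, "play": "yes"},
]

def indiscernibility_classes(U, B):
    """Partition object indices into equivalence classes of R_B:
    two objects fall in one class iff they agree on every a in B."""
    classes = defaultdict(list)
    for i, x in enumerate(U):
        classes[tuple(x[a] for a in B)].append(i)
    return list(classes.values())

print(indiscernibility_classes(U, ["outlook"]))           # [[0, 1], [2, 3]]
print(indiscernibility_classes(U, ["outlook", "windy"]))  # [[0, 1], [2], [3]]
```

Objects 0 and 1 agree on both conditional attributes, so they remain indiscernible even under the full attribute set.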
Let $B \subseteq A$ and $X \subseteq U$, $X$
can be approximated using the information contained within
$B$ by constructing the $B$-lower and $B$-upper
approximations of $X$:
$R_B \downarrow X = \{ x \in U \mid [x]_{B} \subseteq X \}$
$R_B \uparrow X = \{ x \in U \mid [x]_{B} \cap X \neq \emptyset \}$
The tuple $\langle R_B \downarrow X, R_B \uparrow X
\rangle$ is called a rough set. The objects in $R_B
\downarrow X$ can be classified with certainty as
members of $X$ on the basis of the knowledge in $B$,
while the objects in $R_B \uparrow X$ can only be
classified as possible members of $X$ on the basis of
the knowledge in $B$.
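A sketch of both approximations in Python (the toy table is invented for illustration; the package's own implementation is the R function BC.LU.approximation.RST):

```python
from collections import defaultdict

# Toy decision table (invented for illustration).
U = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "sunny", "play": "yes"},
    {"outlook": "rainy", "play": "yes"},
]

def partition(U, attrs):
    """Equivalence classes (as sets of object indices) of the
    indiscernibility relation induced by the given attributes."""
    classes = defaultdict(set)
    for i, x in enumerate(U):
        classes[tuple(x[a] for a in attrs)].add(i)
    return list(classes.values())

def approximations(U, B, X):
    """B-lower and B-upper approximations of a set X of object indices."""
    X = set(X)
    lower, upper = set(), set()
    for cls in partition(U, B):
        if cls <= X:   # class entirely inside X -> certain members
            lower |= cls
        if cls & X:    # class overlaps X -> possible members
            upper |= cls
    return lower, upper

low, up = approximations(U, ["outlook"], {1, 2})  # X = objects labelled "yes"
print(low, up)  # {2} {0, 1, 2}
```

Objects 0 and 1 are indiscernible on outlook yet carry different decisions, so they enter the upper but not the lower approximation of $X$.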
In a decision system, for $X$ we use decision concepts
(equivalence classes of decision attribute) $[x]_d$. We
can define $B$-lower and $B$-upper approximations
as follows.
$R_B \downarrow [x]_d = \{ y \in U \mid [y]_{B} \subseteq [x]_d \}$
$R_B \uparrow [x]_d = \{ y \in U \mid [y]_{B} \cap [x]_d \neq \emptyset \}$
The positive, negative and boundary regions with respect
to $B$ can be defined as follows. The positive region
$POS_{B}$ contains all objects that can be classified
with certainty:
$POS_{B} = \bigcup_{x \in U} R_B \downarrow [x]_d$
The boundary region, $BND_{B}$, is the set of objects
that can possibly, but not certainly, be classified:
$BND_{B} = \bigcup_{x \in U} R_B \uparrow [x]_d -
\bigcup_{x \in U} R_B \downarrow [x]_d$
The negative region of a single concept $X$ is $U - R_B
\uparrow X$, the set of objects that certainly do not
belong to $X$.
Furthermore, we can calculate the degree of dependency of
the decision on a set of attributes. For $B \subseteq A$,
it is said that $d$ depends on $B$ in a degree of
dependency $\gamma_{B} = \frac{|POS_{B}|}{|U|}$. The
decision attribute $d$ depends totally on $B$, denoted
$B \Rightarrow d$, if $\gamma_{B} = 1$, i.e. all values
of $d$ are uniquely determined by the values of the
attributes in $B$.
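The positive and boundary regions and the degree of dependency can be sketched in Python as follows (toy table invented for illustration; the package's R counterpart is BC.positive.reg.RST):

```python
from collections import defaultdict
from fractions import Fraction

# Toy decision table (invented for illustration); objects 0 and 1 are
# indiscernible on "outlook" but carry different decisions.
U = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "sunny", "play": "yes"},
    {"outlook": "rainy", "play": "yes"},
]

def partition(U, attrs):
    """Equivalence classes (sets of object indices) on the given attributes."""
    classes = defaultdict(set)
    for i, x in enumerate(U):
        classes[tuple(x[a] for a in attrs)].add(i)
    return list(classes.values())

def regions(U, B, d):
    """Positive region, boundary region and degree of dependency gamma_B."""
    b_classes = partition(U, B)
    pos, upper_union = set(), set()
    for dec_class in partition(U, [d]):
        for cls in b_classes:
            if cls <= dec_class:   # part of the lower approximation
                pos |= cls
            if cls & dec_class:    # part of the upper approximation
                upper_union |= cls
    bnd = upper_union - pos
    gamma = Fraction(len(pos), len(U))
    return pos, bnd, gamma

pos, bnd, gamma = regions(U, ["outlook"], "play")
print(pos, bnd, gamma)  # {2} {0, 1} 1/3
```

Here only object 2 is classified with certainty, so $\gamma_{B} = 1/3$; the inconsistent pair 0, 1 forms the boundary region.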
A decision reduct is a set $B \subseteq A$ such that
$\gamma_{B} = \gamma_{A}$ and $\gamma_{B'} <
\gamma_{B}$ for every $B' \subset B$. One algorithm to
determine all reducts is by constructing the
decision-relative discernibility matrix. The discernibility
matrix $M(\mathcal{A})$ is an $n \times n$ matrix
$(c_{ij})$ where
$c_{ij} = \{ a \in A : a(x_i) \neq a(x_j) \}$ if
$d(x_i) \neq d(x_j)$, and
$c_{ij} = \emptyset$ otherwise.
The discernibility function $f_{\mathcal{A}}$ for a
decision system $\mathcal{A}$ is a boolean function of
$m$ boolean variables $\bar{a}_1, \ldots,
\bar{a}_m$ corresponding to the attributes $a_1,
\ldots, a_m$ respectively, and defined by
$f_{\mathcal{A}}(\bar{a}_1, \ldots, \bar{a}_m) = \bigwedge
\{ \bigvee \bar{c}_{ij} : 1 \le j < i \le n,\ c_{ij} \neq \emptyset \}$
where $\bar{c}_{ij} = \{ \bar{a} : a \in c_{ij} \}$. The
decision reducts of $\mathcal{A}$ are then the prime
implicants of the function $f_{\mathcal{A}}$. The complete
explanation of the algorithm can be seen in (Skowron and
Rauszer, 1992).
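Since $f_{\mathcal{A}}$ is a conjunction of positive clauses, its prime implicants correspond to the minimal attribute sets that intersect every nonempty $c_{ij}$. A brute-force Python sketch of this (exponential in $|A|$, with an invented toy table, purely for illustration):

```python
from itertools import combinations

# Toy decision table (invented for illustration).
U = [
    {"outlook": "sunny", "temp": "hot",  "windy": False, "play": "no"},
    {"outlook": "sunny", "temp": "hot",  "windy": True,  "play": "no"},
    {"outlook": "rainy", "temp": "mild", "windy": False, "play": "yes"},
    {"outlook": "rainy", "temp": "hot",  "windy": True,  "play": "no"},
]
A = ["outlook", "temp", "windy"]

def discernibility_entries(U, A, d):
    """Nonempty entries c_ij of the decision-relative discernibility matrix."""
    entries = []
    for i in range(len(U)):
        for j in range(i):
            if U[i][d] != U[j][d]:
                c = frozenset(a for a in A if U[i][a] != U[j][a])
                if c:
                    entries.append(c)
    return entries

def reducts(U, A, d):
    """All decision reducts, found as the minimal attribute sets that
    intersect every nonempty c_ij (the prime implicants of f_A)."""
    entries = discernibility_entries(U, A, d)
    found = []
    for size in range(1, len(A) + 1):
        for B in combinations(A, size):
            B = set(B)
            if any(r <= B for r in found):
                continue  # proper superset of a known reduct: not minimal
            if all(B & c for c in entries):
                found.append(B)
    return found

print(reducts(U, A, "play"))
# -> [{'temp'}, {'outlook', 'windy'}] (element order within sets may vary)
```

For this table $f_{\mathcal{A}} = (\bar{o} \vee \bar{t}) \wedge (\bar{o} \vee \bar{t} \vee \bar{w}) \wedge (\bar{t} \vee \bar{w})$, whose prime implicants are $\bar{t}$ and $\bar{o} \wedge \bar{w}$, matching the two reducts found.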
The implementations of these RST concepts can be seen in
BC.IND.relation.RST, BC.LU.approximation.RST,
BC.positive.reg.RST, and BC.discernibility.mat.RST.
References
A. Skowron and C. Rauszer, "The Discernibility Matrices and
Functions in Information Systems", in: R. Slowinski (Ed.),
Intelligent Decision Support: Handbook of Applications and
Advances of Rough Sets Theory, Kluwer Academic Publishers,
Dordrecht, Netherlands, pp. 331 - 362 (1992).
Z. Pawlak, "Rough Sets", International Journal of Computer
and Information Sciences, vol. 11, no. 5, pp. 341 - 356 (1982).
Z. Pawlak, "Rough Sets: Theoretical Aspects of Reasoning
about Data", System Theory, Knowledge Engineering and
Problem Solving, vol. 9, Kluwer Academic Publishers,
Dordrecht, Netherlands (1991).