SDR (version 0.7.0.0)

MESDIF: Multiobjective Evolutionary Subgroup DIscovery Fuzzy rules (MESDIF) Algorithm

Description

Performs a subgroup discovery task by running the MESDIF algorithm.

Usage

MESDIF(paramFile = NULL, training = NULL, test = NULL, output = c("optionsFile.txt", "rulesFile.txt", "testQM.txt"), seed = 0, nLabels = 3, nEval = 10000, popLength = 100, eliteLength = 3, crossProb = 0.6, mutProb = 0.01, RulesRep = "can", Obj1 = "CSUP", Obj2 = "CCNF", Obj3 = "null", Obj4 = "null", targetVariable = NA, targetClass = "null")

Arguments

paramFile
The path of the parameters file. Use NULL if you want to pass the training and test keel variables instead.
training
A keel class variable with training data.
test
A keel class variable with test data.
output
A character vector with the paths where the information file, the rules file and the test quality measures file are stored, respectively.
seed
An integer that sets the seed used to generate random numbers.
nLabels
Number of fuzzy labels defined in the datasets.
nEval
An integer that sets the maximum number of evaluations in the evolutionary process.
popLength
An integer to set the number of individuals in the population.
eliteLength
An integer to set the number of individuals in the elite population.
crossProb
Sets the crossover probability. A number in [0,1].
mutProb
Sets the mutation probability. A number in [0,1].
RulesRep
Representation used in the rules. "can" for canonical rules, "dnf" for DNF rules.
Obj1
Sets the Objective number 1. See Objective values for more information about the possible values.
Obj2
Sets the Objective number 2. See Objective values for more information about the possible values.
Obj3
Sets the Objective number 3. See Objective values for more information about the possible values.
Obj4
Sets the Objective number 4. See Objective values for more information about the possible values.
targetVariable
The name or index position of the target variable (or class). It must be a categorical one.
targetClass
A string specifying the value of the target variable. Use null to search for all possible values.
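
The constraints stated for these arguments can be checked before starting a long run. The following is a minimal sketch in base R; check_mesdif_args is a hypothetical helper that is not part of the SDR package, and it only encodes the limits given in this page (probabilities in [0,1], eliteLength less than popLength, RulesRep either "can" or "dnf").

```r
# Hypothetical helper, not part of SDR: returns a character vector with one
# message per violated constraint (empty when everything looks fine).
check_mesdif_args <- function(popLength, eliteLength, crossProb, mutProb, RulesRep) {
  problems <- character(0)
  if (crossProb < 0 || crossProb > 1) {
    problems <- c(problems, "crossProb must be in [0,1]")
  }
  if (mutProb < 0 || mutProb > 1) {
    problems <- c(problems, "mutProb must be in [0,1]")
  }
  if (eliteLength >= popLength) {
    problems <- c(problems, "eliteLength must be less than popLength")
  }
  if (!RulesRep %in% c("can", "dnf")) {
    problems <- c(problems, "RulesRep must be \"can\" or \"dnf\"")
  }
  problems
}

# The default values from the Usage section pass every check.
default_problems <- check_mesdif_args(100, 3, 0.6, 0.01, "can")

# An elite as large as the population and a probability above 1 are reported.
bad_problems <- check_mesdif_args(100, 100, 1.5, 0.01, "can")
print(bad_problems)
```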

Value

The algorithm prints the following results to the console:
  1. The parameters used in the algorithm.
  2. The rules generated.
  3. The test quality measures for every rule and the global results.
The algorithm also saves those results in the files specified in the output parameter of the function or in the outputData parameter of the parameters file.

How does this algorithm work?

This algorithm is a multi-objective genetic algorithm based on elitism (following the SPEA2 approach). The elite population has a fixed size and is filled with non-dominated individuals. An individual is non-dominated when (! all(ObjI1 <= ObjI2) & any(ObjI1 < ObjI2)), where ObjI1 is the objective value for our individual and ObjI2 is the objective value for another individual. The number of individuals that each one dominates, together with a niching technique that considers the proximity among the objective values, determines a fitness value used for selection. The number of non-dominated individuals may be greater or smaller than the elite population size; in those cases MESDIF applies a truncation operator or a fill operator, respectively. Then the genetic operators are applied. At the end of the evolutionary process, the algorithm returns the rules stored in the elite population.
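
The idea of dominance and of counting dominated individuals can be illustrated with a small base R sketch. It is not the package's internal code: the helper dominates, the objective matrix and its values are invented for illustration, and the comparison uses the textbook Pareto definition for maximised objectives, which is an assumption and may differ from the check MESDIF actually performs.

```r
# Hypothetical helper: TRUE when objective vector `a` Pareto-dominates `b`,
# assuming every objective is to be maximised.
dominates <- function(a, b) {
  all(a >= b) && any(a > b)
}

# Rows are individuals, columns are two objective values (made-up numbers).
objectives <- rbind(
  ind1 = c(0.80, 0.60),
  ind2 = c(0.50, 0.90),
  ind3 = c(0.40, 0.50),
  ind4 = c(0.80, 0.40)
)

n <- nrow(objectives)
ids <- rownames(objectives)

# dom[i, j] is TRUE when individual i dominates individual j.
dom <- matrix(FALSE, n, n, dimnames = list(ids, ids))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    if (i != j) {
      dom[i, j] <- dominates(objectives[i, ], objectives[j, ])
    }
  }
}

# Number of individuals that each one dominates.
strength <- rowSums(dom)

# An individual is non-dominated when no other individual dominates it.
non_dominated <- ids[colSums(dom) == 0]

print(strength)
print(non_dominated)
```

In this toy population two individuals are non-dominated, so with eliteLength = 3 the elite would be smaller than its fixed size and the fill operator described above would come into play; with more non-dominated individuals than eliteLength, the truncation operator would be needed instead.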

Parameters file structure

The paramFile argument points to a file containing the parameters that MESDIF needs. The file must contain at least the following parameters, one per line:
  • algorithm Specifies the algorithm to execute. In this case, "MESDIF".
  • inputData Specifies the paths of the two KEEL files for training and test. If only a file name is given, the file is looked up in the working directory.
  • seed Sets the seed for the random number generator.
  • nLabels Sets the number of fuzzy labels to create when reading the files.
  • nEval Sets the maximum number of rule evaluations before the genetic process stops.
  • popLength Sets the number of individuals in the main population.
  • eliteLength Sets the number of individuals in the elite population. Must be less than popLength.
  • crossProb Crossover probability of the genetic algorithm. Value in [0,1].
  • mutProb Mutation probability of the genetic algorithm. Value in [0,1].
  • Obj1 Sets objective number 1.
  • Obj2 Sets objective number 2.
  • Obj3 Sets objective number 3.
  • Obj4 Sets objective number 4.
  • RulesRep Representation of each chromosome of the population: "can" for canonical representation, "dnf" for DNF representation.
  • targetClass Value of the target variable for which subgroups are searched. The target variable is always the last variable. Use null to search for every value of the target variable.
An example parameter file:
 algorithm = MESDIF
 inputData = "irisd-10-1tra.dat" "irisd-10-1tst.dat"
 outputData = "irisD-10-1-INFO.txt" "irisD-10-1-Rules.txt" "irisD-10-1-TestMeasures.txt"
 seed = 0
 nLabels = 3
 nEval = 500
 popLength = 100
 eliteLength = 3
 crossProb = 0.6
 mutProb = 0.01
 RulesRep = can
 Obj1 = comp
 Obj2 = unus
 Obj3 = null
 Obj4 = null
 targetClass = Iris-setosa 
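
A parameter file like this one can be created from R itself. The sketch below writes the example lines to a temporary file and reads them back to inspect the settings; the file name mesdif_params.txt and the variable names are hypothetical, and the final call to MESDIF is left commented out because it needs the SDR package and the two KEEL files to be available.

```r
# Parameter lines copied from the example parameter file.
params <- c(
  'algorithm = MESDIF',
  'inputData = "irisd-10-1tra.dat" "irisd-10-1tst.dat"',
  'outputData = "irisD-10-1-INFO.txt" "irisD-10-1-Rules.txt" "irisD-10-1-TestMeasures.txt"',
  'seed = 0',
  'nLabels = 3',
  'nEval = 500',
  'popLength = 100',
  'eliteLength = 3',
  'crossProb = 0.6',
  'mutProb = 0.01',
  'RulesRep = can',
  'Obj1 = comp',
  'Obj2 = unus',
  'Obj3 = null',
  'Obj4 = null',
  'targetClass = Iris-setosa'
)

# Hypothetical file name inside the session's temporary directory.
param_path <- file.path(tempdir(), "mesdif_params.txt")
writeLines(params, param_path)

# Read the file back and split every line at its first "=" sign.
lines <- readLines(param_path)
keys <- trimws(sub("=.*$", "", lines))
values <- trimws(sub("^[^=]*=", "", lines))
settings <- setNames(values, keys)
print(settings[c("algorithm", "nEval", "targetClass")])

# With SDR installed and both KEEL files in the working directory,
# the algorithm would then be started like this (not run here):
# library(SDR)
# MESDIF(paramFile = param_path)
```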

Objective values

You can use the following quality measures as ObjX values in the parameter file, using these codes:
  • Unusualness -> unus
  • Crisp Support -> csup
  • Crisp Confidence -> ccnf
  • Fuzzy Support -> fsup
  • Fuzzy Confidence -> fcnf
  • Coverage -> cove
  • Significance -> sign
If you do not want to use an objective, you must specify null.
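
The codes above can be kept in a small lookup table to catch misspelled objectives before a run. This is only a sketch: objective_codes and is_valid_objective are hypothetical names, not part of SDR, and the lower-casing step is an assumption made because the Usage section writes the codes in upper case ("CSUP") while this list writes them in lower case.

```r
# Quality measures and their codes, as listed above.
objective_codes <- c(
  "Unusualness"      = "unus",
  "Crisp Support"    = "csup",
  "Crisp Confidence" = "ccnf",
  "Fuzzy Support"    = "fsup",
  "Fuzzy Confidence" = "fcnf",
  "Coverage"         = "cove",
  "Significance"     = "sign"
)

# Hypothetical helper: TRUE for every element of `x` that is one of the
# listed codes or "null". Case is ignored (an assumption, see above).
is_valid_objective <- function(x) {
  tolower(x) %in% c(objective_codes, "null")
}

# The objectives used in the Usage section.
chosen <- c(Obj1 = "CSUP", Obj2 = "CCNF", Obj3 = "null", Obj4 = "null")
print(is_valid_objective(chosen))

# A code that is not in the list is flagged.
print(is_valid_objective("accuracy"))
```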

Details

This function uses the last variable that appears in the KEEL file as the target variable. To change the target variable, you can use changeTargetVariable. The target variable MUST be categorical; if it is not, an error is thrown.

If paramFile is set to anything other than NULL, the rest of the parameters are ignored and the algorithm tries to read the specified file. See "Parameters file structure" if you want to use a parameters file.

References

  • Berlanga, F., Del Jesus, M., Gonzalez, P., Herrera, F., & Mesonero, M. (2006). Multiobjective Evolutionary Induction of Subgroup Discovery Fuzzy Rules: A Case Study in Marketing.
  • Zitzler, E., Laumanns, M., & Thiele, L. (2001). SPEA2: Improving the Strength Pareto Evolutionary Algorithm.

Examples

MESDIF( paramFile = NULL,
        training = habermanTra,
        test = habermanTst,
        output = c("optionsFile.txt", "rulesFile.txt", "testQM.txt"),
        seed = 0,
        nLabels = 3,
        nEval = 300,
        popLength = 100,
        eliteLength = 3,
        crossProb = 0.6,
        mutProb = 0.01,
        RulesRep = "can",
        Obj1 = "CSUP",
        Obj2 = "CCNF",
        Obj3 = "null",
        Obj4 = "null",
        targetClass = "positive"
        )

## Not run: 
# Execution for all classes, see 'targetClass' parameter
# MESDIF( paramFile = NULL,
#         training = habermanTra,
#         test = habermanTst,
#         output = c("optionsFile.txt", "rulesFile.txt", "testQM.txt"),
#         seed = 0,
#         nLabels = 3,
#         nEval = 300,
#         popLength = 100,
#         eliteLength = 3,
#         crossProb = 0.6,
#         mutProb = 0.01,
#         RulesRep = "can",
#         Obj1 = "CSUP",
#         Obj2 = "CCNF",
#         Obj3 = "null",
#         Obj4 = "null",
#         targetClass = "null"
#         )
#  ## End(Not run)