ruleInduction: Rule Induction from Itemsets

Description

Provides the generic function and the S4 method to induce all rules that can be generated from a given set of itemsets and a transactions dataset. This method can be used to create closed association rules.

Usage

ruleInduction(x, ...)
# S4 method for itemsets
ruleInduction(x, transactions, confidence = 0.8, 
    control = NULL)

Arguments

x

the set of itemsets from which rules will be induced.

...

further arguments.

transactions

the transaction dataset used to mine the itemsets. Can be omitted if x contains a lattice (complete set) of frequent itemsets together with their support counts.

confidence

a numeric value giving the minimum confidence for the rules.

control

a named list with the element method, which selects the induction method ("apriori" or "ptree"), and the logical elements reduce and verbose, which indicate whether unused items are removed and whether the output should be verbose. Currently, "ptree" is the default method.

Value

An object of class rules.

Details

If method = "apriori" is set in control, a very simple rule induction method is used. First, all rules are mined from the transactions dataset using Apriori with the minimum support found among the itemsets. In a second step, all rules that do not stem from one of the itemsets are removed. This procedure is very slow in many cases (e.g., for itemsets with many elements or very low support).
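The two steps described above can be sketched by hand with the public arules interface. This is only an illustration of the idea under stated assumptions, not the code ruleInduction runs internally; the support and confidence thresholds are arbitrary, and the variable names are hypothetical.

```r
library(arules)
data("Adult")

# Itemsets we want rules for (here: closed frequent itemsets).
itemsets <- apriori(Adult,
  parameter = list(target = "closed frequent itemsets", support = 0.5),
  control = list(verbose = FALSE))

# Step 1: mine all rules with Apriori, using the smallest support
# that occurs among the itemsets as the support threshold.
min_supp <- min(quality(itemsets)$support)
all_rules <- apriori(Adult,
  parameter = list(target = "rules", support = min_supp, confidence = 0.8),
  control = list(verbose = FALSE))

# Step 2: keep only rules whose items (lhs and rhs combined)
# form one of the given itemsets.
kept <- all_rules[items(all_rules) %in% items(itemsets)]
kept
```

Because step 1 mines every rule above the support threshold before any filtering happens, the cost grows quickly when the minimum support among the itemsets is low.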

If method = "ptree" is set in control, the transactions are counted into a prefix tree, and the rules are then selectively generated using the counts in the tree. This is usually faster than the "apriori" approach.

If reduce = TRUE is set in control, unused items are removed from the data before creating rules. This might be slower for large transaction datasets. However, for method = "ptree" it is highly recommended, as the items are additionally reordered to reduce the counting time.

If the argument transactions is missing, it is assumed that x contains a lattice (complete set) of frequent itemsets together with their support counts. Rules can then be induced directly without support counting, which is very fast.
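To see why no further counting is needed in this case, note that the confidence of a rule X -> Y equals the support count of the itemset X & Y divided by the support count of X, and both counts are already present in a complete lattice. The base R sketch below uses made-up counts and a hypothetical helper induce(); it only generates rules with a single item in the rhs and is not the arules implementation.

```r
# Toy support counts for a complete set of frequent itemsets.
# Names encode itemsets as sorted, comma-separated item labels.
counts <- c("a" = 6, "b" = 7, "c" = 5,
            "a,b" = 5, "a,c" = 3, "b,c" = 5, "a,b,c" = 3)
n_trans <- 10  # hypothetical number of transactions

# Canonical lookup key for a set of items.
key <- function(items) paste(sort(items), collapse = ",")

# Hypothetical helper: derive rules from the counts alone.
induce <- function(counts, n_trans, min_conf) {
  rows <- list()
  for (nm in names(counts)) {
    items <- strsplit(nm, ",", fixed = TRUE)[[1]]
    if (length(items) < 2) next          # need a non-empty lhs and an rhs
    for (rhs in items) {
      lhs <- setdiff(items, rhs)
      # confidence = count(lhs and rhs together) / count(lhs)
      conf <- counts[[nm]] / counts[[key(lhs)]]
      if (conf >= min_conf) {
        rows[[length(rows) + 1]] <- data.frame(
          lhs = key(lhs), rhs = rhs,
          support = counts[[nm]] / n_trans, confidence = conf,
          stringsAsFactors = FALSE)
      }
    }
  }
  do.call(rbind, rows)
}

rules <- induce(counts, n_trans, min_conf = 0.8)
rules
```

The lookup counts[[key(lhs)]] only succeeds because every subset of a frequent itemset is itself in the set, which is what "complete set" requires.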

For transactions, a dataset different from the one used to create the original itemsets can be used; however, the new dataset has to conform in terms of items and their order.
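As a rough illustration, itemsets can be mined on one part of a dataset and the rules counted on another part. The split point and thresholds below are arbitrary assumptions; subsetting the rows of a transactions object keeps the item labels and their order, so the two parts conform.

```r
library(arules)
data("Adult")

# Hypothetical split: mine itemsets on the first part, count on the rest.
n <- length(Adult)
train <- Adult[1:30000]
test  <- Adult[30001:n]

freq <- eclat(train, parameter = list(support = 0.4),
              control = list(verbose = FALSE))

# Check that the itemsets and the new transactions use the same
# items in the same order.
stopifnot(identical(itemLabels(freq), itemLabels(test)))

# Induce rules from the itemsets, counting on the held-out transactions.
rules_test <- ruleInduction(freq, test, confidence = 0.8)
rules_test
```

The quality measures of rules_test are then based on counts in test rather than in train.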

This method can be used to produce closed association rules, defined by Pei et al. (2000) as rules \(X \rightarrow Y\) where both \(X\) and \(Y\) are closed frequent itemsets. See the Examples section for code.

References

Michael Hahsler, Christian Buchta, and Kurt Hornik. Selective association rule generation. Computational Statistics, 23(2):303-315, April 2008.

Jian Pei, Jiawei Han, and Runying Mao. CLOSET: An Efficient Algorithm for Mining Frequent Closed Itemsets. ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD 2000), 2000.

Examples

# NOT RUN {
data("Adult")

## find all closed frequent itemsets
closed_is <- apriori(Adult,
  parameter = list(target = "closed frequent itemsets", support = 0.4))
closed_is

## use rule induction to produce all closed association rules
closed_rules <- ruleInduction(closed_is, Adult, 
  control = list(verbose = TRUE))

## the itemset X & Y of each rule is already closed; show the rules whose lhs X is also closed
closed_rules[is.element(lhs(closed_rules), items(closed_is))]

## inspect the resulting closed rules
summary(closed_rules)
inspect(head(closed_rules, by = "lift"))

## use lattice of frequent itemsets
ec  <- eclat(Adult, parameter = list(support = 0.4))
rec <- ruleInduction(ec)
rec
inspect(head(rec))
# }
