arules — Mining Association Rules and Frequent Itemsets with R

The arules package for R provides the infrastructure for representing, manipulating and analyzing transaction data and patterns using frequent itemsets and association rules. The package also provides a wide range of interest measures and mining algorithms including the code of Christian Borgelt’s popular and efficient C implementations of the association mining algorithms Apriori and Eclat. In addition, the following mining algorithms are available via fim4r:

  • Apriori
  • Eclat
  • Carpenter
  • FPgrowth
  • IsTa
  • RElim
  • SaM

Code examples can be found in Chapter 5 of the web book R Companion for Introduction to Data Mining.

arules core packages:

  • arules: Base package with data structures, mining algorithms (Apriori and Eclat), and interest measures.
  • arulesViz: Visualization of association rules.
  • arulesCBA: Classification algorithms based on association rules (includes CBA).
  • arulesSequences: Mining frequent sequences (cSPADE).
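As a rough sketch of how the pieces of the base package fit together (not taken from the package documentation), the following mines frequent itemsets with eclat(), induces rules from them with ruleInduction(), and adds extra interest measures with interestMeasure(). The Groceries data set ships with arules; the thresholds and variable names are arbitrary choices for illustration.

```r
library("arules")
data("Groceries")

# Step 1: frequent itemsets with Eclat (thresholds are illustrative assumptions).
itemsets <- eclat(Groceries,
                  parameter = list(support = 0.01, maxlen = 3),
                  control = list(verbose = FALSE))

# Step 2: turn the itemsets into rules that reach a minimum confidence.
rules <- ruleInduction(itemsets, Groceries, confidence = 0.5)

# Step 3: compute additional interest measures for the induced rules.
extra <- interestMeasure(rules,
                         measure = c("conviction", "leverage"),
                         transactions = Groceries)

inspect(head(rules, n = 3, by = "lift"))
head(extra, n = 3)
```

Splitting itemset mining from rule induction is useful when the same itemsets are reused with several confidence thresholds.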

Other related packages:

Additional mining algorithms

  • arulesNBMiner: Mining NB-frequent itemsets and NB-precise rules.
  • fim4r: Provides fast implementations for several mining algorithms. An interface function called fim4r() is provided in arules.
  • opusminer: OPUS Miner algorithm for finding the top-k productive, non-redundant itemsets. Call opus() with format = 'itemsets'.
  • RKEEL: Interface to KEEL’s association rule mining algorithm.
  • RSarules: Mining algorithm which randomly samples association rules with one pre-chosen item as the consequent from a transaction dataset.

In-database analytics

  • ibmdbR: IBM in-database analytics for R can calculate association rules from a database table.
  • rfml: Mine frequent itemsets or association rules using a MarkLogic server.

Interface

  • rattle: Provides a graphical user interface for association rule mining.
  • pmml: Generates PMML (predictive model markup language) for association rules.

Classification

  • arc: Alternative CBA implementation.
  • inTrees: Interpret Tree Ensembles provides functions for: extracting, measuring and pruning rules; selecting a compact rule set; summarizing rules into a learner.
  • rCBA: Alternative CBA implementation.
  • qCBA: Quantitative Classification by Association Rules.
  • sblr: Scalable Bayesian rule lists algorithm for classification.

Outlier Detection

Recommendation/Prediction

  • recommenderlab: Supports creating predictions using association rules.

Installation

Stable CRAN version: install from within R with

install.packages("arules")

Current development version: install from GitHub (requires devtools; on Windows, Rtools is also needed).

devtools::install_github("mhahsler/arules")

Usage

Load package and mine some association rules.

library("arules")
data("IncomeESL")

trans <- transactions(IncomeESL)
trans
## transactions in sparse format with
##  8993 transactions (rows) and
##  84 items (columns)
rules <- apriori(trans, supp = 0.1, conf = 0.9, target = "rules")
## Apriori
## 
## Parameter specification:
##  confidence minval smax arem  aval originalSupport maxtime support minlen
##         0.9    0.1    1 none FALSE            TRUE       5     0.1      1
##  maxlen target  ext
##      10  rules TRUE
## 
## Algorithmic control:
##  filter tree heap memopt load sort verbose
##     0.1 TRUE TRUE  FALSE TRUE    2    TRUE
## 
## Absolute minimum support count: 899 
## 
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[84 item(s), 8993 transaction(s)] done [0.01s].
## sorting and recoding items ... [42 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 4 5 6 done [0.03s].
## writing ... [457 rule(s)] done [0.00s].
## creating S4 object  ... done [0.00s].
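The call above passes the thresholds as shorthand arguments. As a minimal sketch, the same thresholds can also go into the parameter list, and the appearance argument can restrict which items may appear in the consequent. The item label "marital status=married" is an item of the IncomeESL transactions; the name married_rules and the minlen setting are assumptions made for illustration.

```r
library("arules")
data("IncomeESL")
trans <- transactions(IncomeESL)

# Only mine rules whose right-hand side is the chosen item;
# all other items may appear on the left-hand side.
married_rules <- apriori(
    trans,
    parameter  = list(support = 0.1, confidence = 0.9, minlen = 2, target = "rules"),
    appearance = list(rhs = "marital status=married", default = "lhs"),
    control    = list(verbose = FALSE)
)

length(married_rules)
inspect(head(married_rules, n = 3, by = "confidence"))
```

Restricting the consequent during mining avoids generating rules that would be filtered out afterwards.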

Inspect the rules with the highest lift.

inspect(head(rules, n = 3, by = "lift"))
##     lhs                           rhs                      support confidence coverage lift count
## [1] {dual incomes=no,                                                                            
##      householder status=own}   => {marital status=married}    0.10       0.97     0.10  2.6   914
## [2] {years in bay area=>10,                                                                      
##      dual incomes=yes,                                                                           
##      type of home=house}       => {marital status=married}    0.10       0.96     0.10  2.6   902
## [3] {dual incomes=yes,                                                                           
##      householder status=own,                                                                     
##      type of home=house,                                                                         
##      language in home=english} => {marital status=married}    0.11       0.96     0.11  2.6   988
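The quality columns in this output (support, confidence, coverage, lift, count) can be reproduced by hand. The following base R sketch computes them for a single rule on a small made-up basket data set; the data and all helper names are hypothetical and are not part of arules.

```r
# Toy market-basket data (hypothetical, for illustration only).
baskets <- list(
  c("bread", "milk"),
  c("bread", "butter"),
  c("bread", "milk", "butter"),
  c("milk"),
  c("bread", "milk")
)

# Fraction of baskets that contain every item in `items`.
supp_of <- function(items, data) {
  mean(vapply(data, function(b) all(items %in% b), logical(1)))
}

# Rule under study: {bread} => {milk}
lhs <- "bread"
rhs <- "milk"

n            <- length(baskets)
rule_supp    <- supp_of(c(lhs, rhs), baskets)      # share of baskets with lhs and rhs
rule_cov     <- supp_of(lhs, baskets)              # share of baskets with lhs
rule_conf    <- rule_supp / rule_cov               # share of lhs baskets that also hold rhs
rule_lift    <- rule_conf / supp_of(rhs, baskets)  # confidence relative to the rhs base rate
rule_count   <- rule_supp * n                      # absolute number of supporting baskets

measures <- c(support = rule_supp, confidence = rule_conf,
              coverage = rule_cov, lift = rule_lift, count = rule_count)
print(round(measures, 4))
```

A lift above 1 means the left-hand side makes the right-hand side more likely than its base rate; in this toy data the lift of {bread} => {milk} comes out below 1.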

Using arules with tidyverse

arules works seamlessly with tidyverse. For example:

  • dplyr can be used for cleaning and preparing the transactions.
  • transactions() and other functions accept tibble as input.
  • Functions in arules can be used with %>%.
  • arulesViz provides visualizations based on ggplot2.

For example, we can remove the ethnic information column before creating transactions and then mine and inspect rules.

library("tidyverse")
library("arules")
data("IncomeESL")

trans <- IncomeESL %>%
    select(-`ethnic classification`) %>%
    transactions()
rules <- trans %>%
    apriori(supp = 0.1, conf = 0.9, target = "rules", control = list(verbose = FALSE))
rules %>%
    head(n = 3, by = "lift") %>%
    inspect()
##     lhs                           rhs                      support confidence coverage lift count
## [1] {dual incomes=no,                                                                            
##      householder status=own}   => {marital status=married}    0.10       0.97     0.10  2.6   914
## [2] {years in bay area=>10,                                                                      
##      dual incomes=yes,                                                                           
##      type of home=house}       => {marital status=married}    0.10       0.96     0.10  2.6   902
## [3] {dual incomes=yes,                                                                           
##      householder status=own,                                                                     
##      type of home=house,                                                                         
##      language in home=english} => {marital status=married}    0.11       0.96     0.11  2.6   988
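Going the other way, mined rules can be turned into a tibble for further work with dplyr. This is a minimal sketch that uses DATAFRAME() from arules; the lift cutoff and the name rules_tbl are arbitrary choices for illustration.

```r
library("dplyr")
library("arules")
data("IncomeESL")

rules <- IncomeESL %>%
    transactions() %>%
    apriori(supp = 0.1, conf = 0.9, target = "rules", control = list(verbose = FALSE))

# Convert the rules to a data frame (columns LHS, RHS and the quality measures),
# then filter and sort with ordinary dplyr verbs.
rules_tbl <- rules %>%
    DATAFRAME() %>%
    as_tibble() %>%
    filter(lift > 2) %>%
    arrange(desc(confidence))

rules_tbl
```

Once the rules are in a tibble they are plain data, so subsetting the arules object itself (for example with subset()) is the better choice when the result is needed as rules again.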

Using arules from Python

See Getting started with arules using Python.

Support

Please report bugs here on GitHub. Questions should be posted on Stack Overflow and tagged with arules.

Monthly Downloads

19,769

Version

1.7-3

License

GPL-3

Maintainer

Michael Hahsler

Last Published

January 9th, 2022

Functions in arules (1.7-3)

  • AScontrol-classes: Classes AScontrol, APcontrol, ECcontrol --- Specifying the control Argument of Apriori and Eclat
  • confint: Confidence Intervals for Interest Measures for Association Rules
  • c: Combining Association and Transaction Objects
  • associations-class: Class associations --- A Set of Associations
  • apriori: Mining Associations with the Apriori Algorithm
  • affinity: Computing Affinity Between Items
  • fim4r: Interface to Mining Algorithms from fim4r
  • extract: Methods for "[": Extraction or Subsetting arules Objects
  • eclat: Mining Associations with Eclat
  • duplicated: Find Duplicated Elements
  • addComplement: Add Complement-items to Transactions
  • SunBai: The SunBai Weighted Transactions Data Set
  • is.superset: Find Super and Subsets
  • abbreviate: Abbreviate item labels in transactions, itemMatrix and associations
  • is.generator: Find Generator Itemsets
  • is.maximal: Find Maximal Itemsets
  • interestMeasure: Calculate Additional Interest Measures
  • itemsets-class: Class itemsets --- A Set of Itemsets
  • dissimilarity: Dissimilarity Matrix Computation for Associations and Transactions
  • itemMatrix-class: Class itemMatrix --- Sparse Binary Incidence Matrix to Represent Sets of Items
  • discretize: Convert a Continuous Variable into a Categorical Variable
  • ruleInduction: Association Rule Induction from Itemsets
  • is.closed: Find Closed Itemsets
  • rules-class: Class rules --- A Set of Rules
  • match: Value Matching
  • itemwiseSetOps: Itemwise Set Operations
  • image: Visual Inspection of Binary Incidence Matrices
  • itemCoding: Item Coding --- Conversion between Item Labels and Column IDs
  • inspect: Display Associations and Transactions in Readable Form
  • itemFrequency: Getting Frequency/Support for Single Items
  • read: Read Transaction Data
  • size: Number of Items in Sets
  • random.transactions: Simulate a Random Transactions
  • coverage: Calculate coverage for rules
  • sort: Sort Associations
  • merge: Adding Items to Data
  • itemFrequencyPlot: Creating a Item Frequencies/Support Bar Plot
  • pmml: Read and Write PMML
  • subset: Subsetting Itemsets, Rules and Transactions
  • weclat: Mining Associations from Weighted Transaction Data with Eclat (WARM)
  • hierarchy: Support for Item Hierarchies
  • is.redundant: Find Redundant Rules
  • support: Support Counting for Itemsets
  • crossTable: Cross-tabulate joint occurrences across pairs of items
  • hits: Computing Transaction Weights With HITS
  • write: Write Transactions or Associations to a File
  • supportingTransactions: Supporting Transactions
  • proximity-classes: Classes dist, ar_cross_dissimilarity and ar_similarity --- Proximity Matrices
  • is.significant: Find Significant Rules
  • predict: Model Predictions
  • sample: Random Samples and Permutations
  • transactions-class: Class transactions --- Binary Incidence Matrix for Transactions
  • sets: Set Operations
  • unique: Remove Duplicated Elements from a Collection
  • tidLists-class: Class tidLists --- Transaction ID Lists for Items/Itemsets
  • DATAFRAME: Data.frame Representation for arules Objects
  • Groceries: The Groceries Transactions Data Set
  • Income: The Income Data Set
  • Adult: Adult Data Set
  • APappearance-class: Class APappearance --- Specifying the appearance Argument of Apriori to Implement Rule Templates
  • Mushroom: The Mushroom Data Set as Transactions
  • LIST: List Representation for Objects Based on Class itemMatrix
  • ASparameter-classes: Classes ASparameter, APparameter, ECparameter --- Specifying the parameter Argument of APRIORI and ECLAT
  • Epub: The Epub Transactions Data Set