Provides the generic function is.significant() and the method to find significant rules.
is.significant(x, ...)
# S4 method for rules
is.significant(
x,
transactions = NULL,
method = "fisher",
alpha = 0.01,
adjust = "none",
reuse = TRUE,
...
)
returns a logical vector indicating which rules are significant.
x: a set of rules.
...: further arguments are passed on to interestMeasure().
transactions: optional set of transactions. Only needed if x does not contain sufficient interest measures, or if the test should be performed on a transaction set different from the one used for mining (use reuse = FALSE).
method: test to use. Options are "fisher" and "chisq". Note that the contingency table is likely to have cells with low expected values, so Fisher's exact test might be more appropriate than the chi-squared test.
alpha: required significance level.
adjust: method to adjust for multiple comparisons. Some options are "none", "bonferroni", "holm", "fdr", etc. (see stats::p.adjust() for more methods).
reuse: logical indicating if information in the quality slot should be reused for calculating the measures.
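To make the effect of the adjust argument concrete, here is a minimal base R sketch of how stats::p.adjust() changes which p-values fall below alpha. The p-values are made up for illustration and do not come from any rule set; the sketch only shows the adjustment step and is not the package's internal code.

```r
# Hypothetical p-values for four rules (illustrative numbers only).
p     <- c(0.001, 0.008, 0.012, 0.04)
alpha <- 0.01

# Without adjustment, each p-value is compared to alpha directly.
sig_none <- p < alpha

# With adjustment, the p-values are corrected first and then compared.
sig_bonferroni <- p.adjust(p, method = "bonferroni") < alpha
sig_holm       <- p.adjust(p, method = "holm") < alpha
sig_fdr        <- p.adjust(p, method = "fdr") < alpha

# One row per adjustment method, one column per rule.
print(rbind(none = sig_none, bonferroni = sig_bonferroni,
            holm = sig_holm, fdr = sig_fdr))
```

In this toy case the unadjusted comparison flags the first two p-values, while the Bonferroni correction (which multiplies each p-value by the number of tests) keeps only the first.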
Michael Hahsler
The implementation for association rules uses Fisher's exact test with correction for multiple comparisons to test the null hypothesis that the LHS and the RHS of the rule are independent. Significant rules have a p-value less than the specified significance level alpha (the null hypothesis of independence is rejected). See Hahsler and Hornik (2007) for details.
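The independence test described above can be sketched for a single rule in base R. The counts below are hypothetical, and the one-sided alternative ("greater") is an assumption of this sketch rather than a statement about the package internals. The sketch builds the 2x2 contingency table of LHS presence against RHS presence and applies stats::fisher.test().

```r
# Hypothetical counts for one rule LHS => RHS (illustrative only).
n      <- 1000  # total number of transactions
n_lhs  <- 200   # transactions containing the LHS
n_rhs  <- 300   # transactions containing the RHS
n_both <- 90    # transactions containing both LHS and RHS

# 2x2 contingency table: rows = LHS present/absent, columns = RHS present/absent.
tab <- matrix(
  c(n_both,         n_lhs - n_both,
    n_rhs - n_both, n - n_lhs - n_rhs + n_both),
  nrow = 2, byrow = TRUE,
  dimnames = list(LHS = c("yes", "no"), RHS = c("yes", "no"))
)

# Cell counts expected if LHS and RHS were independent.
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)

# Fisher's exact test; the one-sided alternative is an assumption of this sketch.
p_value <- fisher.test(tab, alternative = "greater")$p.value

# The rule counts as significant if the p-value is below alpha.
alpha <- 0.01
significant <- p_value < alpha
print(significant)
```

Here the rule's items co-occur in 90 transactions, compared with 60 expected under independence, so the null hypothesis is rejected at alpha = 0.01.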
Hahsler, Michael and Kurt Hornik (2007). New probabilistic interest measures for association rules. Intelligent Data Analysis, 11(5):437--455. doi:10.3233/IDA-2007-11502
Other interest measures: confint(), coverage(), interestMeasure(), is.redundant(), support()
Other postprocessing: is.closed(), is.generator(), is.maximal(), is.redundant(), is.superset()
Other associations functions: abbreviate(), associations-class, c(), duplicated(), extract, inspect(), is.closed(), is.generator(), is.maximal(), is.redundant(), is.superset(), itemsets-class, match(), rules-class, sample(), sets, size(), sort(), unique()
library("arules")
data("Income")
rules <- apriori(Income, support = 0.2)
is.significant(rules)
rules[is.significant(rules)]
# Adjust p-values for multiple comparisons
rules[is.significant(rules, adjust = "bonferroni")]