- x
A matrix or data frame. If a matrix, it must be numeric (double) or
logical. If a data frame, all columns must be numeric (double) or logical.
- f
A callback function executed for each generated condition. It may
declare any subset of the arguments listed below. The algorithm detects
which arguments are present and provides only those values to f. This
design allows the user to control both the amount of information received
and the computational cost, as some arguments are more expensive to
compute than others. The function f is expected to return an object
(typically a list) representing a pattern or patterns related to the
condition. The results of all calls of f are collected and returned as
a list. Possible arguments are: condition, sum, support, indices,
weights, pp, pn, np, nn, or foci_supports (deprecated), which
are thoroughly described below in the "Details" section.
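
The argument-detection idea can be illustrated in base R. The following is a minimal sketch, not the actual implementation of dig(): the helper call_with_declared() and the values in available are hypothetical and exist only for illustration.

```r
# Hypothetical helper: call `f` with only the arguments it declares.
call_with_declared <- function(f, available) {
  wanted <- intersect(names(formals(f)), names(available))
  do.call(f, available[wanted])
}

# Values a search step might have on hand (made-up numbers).
available <- list(
  condition = c(a = 1L, b = 2L),
  support   = 0.25,
  sum       = 8
)

# This callback asks only for `condition` and `support`,
# so `sum` is never handed over.
f <- function(condition, support) {
  list(predicates = names(condition), support = support)
}

result <- call_with_declared(f, available)
```

A callback that declares fewer arguments receives less information, which is how the cost of computing the unused values can be avoided.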
- condition
A tidyselect expression (see tidyselect syntax) specifying the columns
of x to use as condition predicates.
- focus
A tidyselect expression (see tidyselect syntax) specifying the columns
of x to use as focus predicates.
- disjoint
An atomic vector (length = number of columns in x) defining
groups of predicates. Columns in the same group cannot appear together in
a condition. With data from partition(), use var_names() on column
names to construct disjoint.
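
As a rough illustration of how such groups constrain conditions, the base R sketch below derives a grouping vector from column names and checks a candidate condition against it. The "variable=value" naming of the columns is an assumption made for this example, the sub() call is only a stand-in for var_names(), and is_allowed() is a hypothetical helper.

```r
# Column names in a "variable=value" style (assumed for illustration).
cols <- c("cyl=4", "cyl=6", "gear=3", "gear=4")

# Stand-in for var_names(): strip everything from "=" onward.
disjoint <- sub("=.*$", "", cols)

# Hypothetical check: TRUE if no two selected predicates share a group.
is_allowed <- function(idx, disjoint) {
  anyDuplicated(disjoint[idx]) == 0L
}

ok  <- is_allowed(c(1, 3), disjoint)  # "cyl=4" with "gear=3"
bad <- is_allowed(c(1, 2), disjoint)  # "cyl=4" with "cyl=6"
```

Here ok is TRUE because the two predicates come from different groups, while bad is FALSE because both belong to the group "cyl".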
- excluded
NULL or a list of character vectors, each representing an
implication formula. In each vector, all but the last element form the
antecedent and the last element is the consequent. These formulae are
treated as tautologies and used to filter out generated conditions. If
a condition contains both the antecedent and the consequent of any such
formula, it is not passed to the callback function f. Likewise, if a
condition contains the antecedent, the corresponding focus (the consequent)
is not passed to f.
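
The two filtering rules can be sketched in base R as follows. This is only an illustration of the rules as stated above, with conditions represented as character vectors of predicate names; is_excluded() and dropped_foci() are hypothetical helpers, not functions of the package.

```r
# Each formula: all but the last element = antecedent, last = consequent.
excluded <- list(c("a", "b", "c"), c("d", "e"))

# Rule 1: a condition holding a whole formula is not passed to f.
is_excluded <- function(condition, excluded) {
  any(vapply(excluded, function(fm) all(fm %in% condition), logical(1)))
}

# Rule 2: if a condition holds the antecedent, the consequent is
# dropped from the foci.
dropped_foci <- function(condition, excluded) {
  hit <- vapply(
    excluded,
    function(fm) all(fm[-length(fm)] %in% condition),
    logical(1)
  )
  vapply(excluded[hit], function(fm) fm[length(fm)], character(1))
}

skip_1 <- is_excluded(c("a", "b", "c", "x"), excluded)  # contains a, b => c
skip_2 <- is_excluded(c("a", "b"), excluded)            # antecedent only
foci_2 <- dropped_foci(c("a", "b"), excluded)           # focus "c" dropped
```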
- min_length
Minimum number of predicates in a condition required to
trigger the callback f. Must be \(\ge 0\). If set to 0, the empty
condition also triggers the callback.
- max_length
Maximum number of predicates allowed in a condition.
Conditions longer than max_length are not generated. If Inf, the only
limit is the total number of available predicates. Must be \(\ge 0\) and
\(\ge\) min_length. This setting strongly influences both the number of
generated conditions and the speed of the search.
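
To get a feel for how strongly the length limits matter, the base R sketch below counts the conditions whose length lies between min_length and max_length for a given number of predicates. The helper count_conditions() is hypothetical, and the count is only an upper bound: it ignores any reduction from disjoint, excluded, or support-based pruning.

```r
# Hypothetical helper: number of predicate subsets with a length
# between min_length and max_length (an upper bound only).
count_conditions <- function(n_predicates, min_length = 0, max_length = Inf) {
  max_length <- min(max_length, n_predicates)
  if (min_length > max_length) return(0)
  sum(choose(n_predicates, min_length:max_length))
}

n_short <- count_conditions(20, max_length = 2)  # 1 + 20 + 190
n_mid   <- count_conditions(20, max_length = 4)
n_all   <- count_conditions(20)                  # all subsets: 2^20
```

With 20 predicates, raising max_length from 2 to 4 grows the bound from 211 to 6196 conditions, and leaving it at Inf allows up to 1048576.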
- min_support
Minimum support of a condition required to trigger f.
Support is the relative frequency of the condition in x. For logical
data, this is the proportion of rows where all condition predicates are
TRUE. For numeric (double) data, support is the mean (over all rows) of
the products of predicate values. Must be in \([0,1]\). If a condition’s
support falls below min_support, recursive generation of its extensions
is stopped. Thus, min_support directly affects search speed and the
number of callback calls.
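
The two definitions of support can be checked by hand in base R. The sketch below uses small made-up data and a condition consisting of the two predicates a and b; it mirrors the definitions given above rather than the internals of the package.

```r
# Logical data: proportion of rows where all predicates are TRUE.
x_lgl <- data.frame(
  a = c(TRUE, TRUE, FALSE, TRUE),
  b = c(TRUE, FALSE, FALSE, TRUE)
)
support_lgl <- mean(x_lgl$a & x_lgl$b)   # 2 of 4 rows

# Numeric data: mean over rows of the product of predicate values.
x_num <- data.frame(
  a = c(1.0, 0.5, 0.0, 0.8),
  b = c(0.5, 1.0, 0.3, 0.5)
)
support_num <- mean(x_num$a * x_num$b)   # (0.5 + 0.5 + 0 + 0.4) / 4
```

With min_support = 0.4, for example, the logical condition (support 0.5) would trigger f, whereas the numeric one (support 0.35) would not, and its extensions would not be generated.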
- min_focus_support
Minimum support of a focus required for it to be
passed to f. For logical data, this is the proportion of rows where both
the condition and the focus are TRUE. For numeric (double) data, support
is computed as the mean (over all rows) of a t-norm of predicate values
(the t-norm is selected by t_norm). Must be in \([0,1]\). Foci with
support below this threshold are excluded. Together with
filter_empty_foci, this parameter influences both search speed and the
number of triggered calls of f.
- min_conditional_focus_support
Minimum conditional support of a focus
within a condition. Defined as the relative frequency of rows where the
focus is TRUE among those where the condition is TRUE. If \(sum\)
(see support in Details) is the number of rows (or sum of truth
degrees for fuzzy data) satisfying the condition, and \(pp\) (see
pp[i] in Details) is the sum of truth degrees where both the condition
and the focus hold, then conditional support is \(pp / sum\). Must be in
\([0,1]\). Foci below this threshold are not passed to f. Together with
filter_empty_foci, this parameter influences search speed and the number
of callback calls.
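
The ratio \(pp / sum\) can be worked through in base R for logical data. The sketch below uses made-up vectors for one condition and two candidate foci (f1 and f2 are hypothetical names) and then applies a threshold the way the description above suggests.

```r
cond <- c(TRUE, TRUE, FALSE, TRUE, TRUE)
foci <- data.frame(
  f1 = c(TRUE, FALSE, TRUE, TRUE, FALSE),
  f2 = c(TRUE, TRUE, FALSE, TRUE, TRUE)
)

# sum: number of rows satisfying the condition.
sum_cond <- sum(cond)

# pp: for each focus, rows where both condition and focus hold.
pp <- vapply(foci, function(f) as.numeric(sum(cond & f)), numeric(1))

conditional_support <- pp / sum_cond

# Foci below the threshold would not be passed to the callback.
min_conditional_focus_support <- 0.6
kept <- names(conditional_support)[
  conditional_support >= min_conditional_focus_support
]
```

Here the condition holds in 4 rows, f1 holds in 2 of them (conditional support 0.5) and f2 in all 4 (conditional support 1), so only f2 survives the threshold of 0.6.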
- max_support
Maximum support of a condition to trigger f. Conditions
with support above this threshold are skipped, but recursive generation of
their supersets continues. Must be in \([0,1]\).
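
The reason generation continues past a skipped condition is that adding predicates can only keep support equal or lower it. A small base R sketch with made-up logical data:

```r
a <- c(TRUE, TRUE, TRUE, TRUE, FALSE)
b <- c(TRUE, FALSE, TRUE, FALSE, FALSE)
max_support <- 0.6

support_a  <- mean(a)       # condition {a}
support_ab <- mean(a & b)   # superset {a, b}

skip_a  <- support_a  > max_support   # {a} is too frequent: skipped
skip_ab <- support_ab > max_support   # {a, b} is within the limit
```

The condition {a} has support 0.8 and is skipped, but its superset {a, b} has support 0.4 and can still trigger the callback.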
- filter_empty_foci
Logical; controls whether f is triggered for
conditions with no remaining foci after filtering by min_focus_support
or min_conditional_focus_support. If TRUE, f is called only when at
least one focus remains. If FALSE, f is called regardless.
- t_norm
T-norm used for conjunction of weights: "goedel" (minimum),
"goguen" (product), or "lukas" (Łukasiewicz).
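
For reference, the three t-norms have the standard definitions min(a, b), a * b, and max(0, a + b - 1). The base R sketch below writes them out and compares the resulting mean conjunction on made-up truth degrees; the list t_norms is a hypothetical construct for this example, not an object exported by the package.

```r
# Standard definitions of the three t-norms, vectorised over rows.
t_norms <- list(
  goedel = function(a, b) pmin(a, b),
  goguen = function(a, b) a * b,
  lukas  = function(a, b) pmax(0, a + b - 1)
)

a <- c(1.0, 0.5, 0.0, 0.8)
b <- c(0.5, 1.0, 0.3, 0.5)

# Mean truth degree of "a and b" under each t-norm.
supports <- vapply(t_norms, function(tn) mean(tn(a, b)), numeric(1))
```

On this data the Gödel t-norm gives 0.375, the Goguen (product) t-norm 0.35, and the Łukasiewicz t-norm 0.325, so the same thresholds can admit or reject different foci depending on t_norm.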
- max_results
Maximum number of results (objects returned by the
callback f) to store and return in the output list. When this limit
is reached, generation of further conditions stops. Use a positive
integer to enable early stopping; set to Inf to remove the cap.
- verbose
Logical; if TRUE, print progress messages.
- threads
Number of threads for parallel computation.
- error_context
A list of details to be used when constructing error
messages. This is mainly useful when dig() is called from another
function and errors should refer to the caller’s argument names rather
than those of dig(). The list must contain:
  - arg_x: name of the argument x as a character string
  - arg_f: name of the argument f as a character string
  - arg_condition: name of the argument condition
  - arg_focus: name of the argument focus
  - arg_disjoint: name of the argument disjoint
  - arg_excluded: name of the argument excluded
  - arg_min_length: name of the argument min_length
  - arg_max_length: name of the argument max_length
  - arg_min_support: name of the argument min_support
  - arg_min_focus_support: name of the argument min_focus_support
  - arg_min_conditional_focus_support: name of the argument
    min_conditional_focus_support
  - arg_max_support: name of the argument max_support
  - arg_filter_empty_foci: name of the argument filter_empty_foci
  - arg_t_norm: name of the argument t_norm
  - arg_threads: name of the argument threads
  - call: environment in which to evaluate error messages
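
A wrapper function might assemble such a list along the following lines. This is a base R sketch under stated assumptions: build_error_context() and my_search() are hypothetical names, the element names are taken from the list above, and the sketch stops short of calling dig() itself.

```r
# Argument names of dig() that the context list refers to.
dig_args <- c(
  "x", "f", "condition", "focus", "disjoint", "excluded",
  "min_length", "max_length", "min_support", "min_focus_support",
  "min_conditional_focus_support", "max_support",
  "filter_empty_foci", "t_norm", "threads"
)

# Hypothetical helper: start from dig()'s own names and override the
# ones the wrapper exposes under a different name.
build_error_context <- function(renamed = character(), call = parent.frame()) {
  caller_names <- stats::setNames(dig_args, dig_args)
  caller_names[names(renamed)] <- renamed
  ctx <- as.list(caller_names)
  names(ctx) <- paste0("arg_", dig_args)
  ctx$call <- call
  ctx
}

# Hypothetical wrapper exposing x as `data` and f as `callback`.
my_search <- function(data, callback) {
  ctx <- build_error_context(
    c(x = "data", f = "callback"),
    call = environment()
  )
  # ctx could then be passed on as the error_context argument.
  ctx
}

ctx <- my_search(NULL, NULL)
```

With this context, an error about x would be able to name the wrapper's argument data instead.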