Filters the log based on the frequency of traces, using an interval or a percentile cut-off.
filter_trace_frequency(
log,
interval = NULL,
percentage = NULL,
reverse = FALSE,
eventlog = deprecated()
)
# S3 method for log
filter_trace_frequency(
log,
interval = NULL,
percentage = NULL,
reverse = FALSE,
eventlog = deprecated()
)
# S3 method for grouped_log
filter_trace_frequency(
log,
interval = NULL,
percentage = NULL,
reverse = FALSE,
eventlog = deprecated()
)
When given an object of type log, it will return a filtered log.
When given an object of type grouped_log, the filter will be applied in a stratified way (i.e. separately for each group).
The returned log will be grouped on the same variables as the original log.
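The difference between ungrouped and stratified filtering can be illustrated with a small base R sketch. It does not use the package itself: the `cases` table, the `group` column and the helper `keep_frequent()` are hypothetical names introduced only for this illustration, and the code is not the package's implementation.

```r
# Toy data (hypothetical): one row per case, with a grouping variable.
cases <- data.frame(
  case_id = 1:8,
  group   = c("x", "x", "x", "x", "y", "y", "y", "y"),
  trace   = c("A,B", "A,B", "A,B", "A", "A", "A", "A", "A,B"),
  stringsAsFactors = FALSE
)

# Keep the cases whose trace occurs at least `min_freq` times in `df`.
keep_frequent <- function(df, min_freq) {
  counts <- table(df$trace)
  df[as.vector(counts[df$trace]) >= min_freq, ]
}

# Ungrouped: trace frequencies are counted over the whole table.
overall <- keep_frequent(cases, 3)

# Stratified: trace frequencies are counted within each group separately,
# and the per-group results are combined again afterwards.
per_group <- do.call(
  rbind,
  lapply(split(cases, cases$group), keep_frequent, min_freq = 3)
)
```

In this toy data both traces occur four times overall, so the ungrouped filter keeps every case. Within each group one of the traces occurs only once, so the stratified filter drops cases 4 and 8.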
log: Object of class log or derivatives (grouped_log, eventlog, activitylog, etc.).
interval, percentage: The target coverage of activity instances. Provide either percentage or interval.
percentage (numeric): A percentile of p will select the most common traces of the log, until at least p% of the cases are covered.
interval (numeric vector of length 2): A trace frequency interval. The filter will select cases whose trace has a frequency inside the interval. Half-open intervals can be created using NA.
For more information, see 'Details' below.
reverse: logical (default FALSE), indicating whether the selection should be reversed.
filter_trace_frequency(log): Filters cases for a log.
filter_trace_frequency(grouped_log): Filters cases for a grouped_log.
Filtering the log based on trace frequency can be done in two ways: using an interval of allowed frequencies, or specifying a coverage percentage:
percentage: When filtering using a percentage p%, the filter will return p% of the cases, starting from the traces with the highest frequency. The filter will retain additional traces as long as the number of activity instances does not exceed the percentage threshold.
interval: When filtering using an interval, traces will be retained when their absolute frequency falls in this interval. The interval is specified using a numeric vector of length 2. Half-open intervals can be created by using NA, e.g., c(10, NA) will select cases with a trace that occurs 10 times or more.
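The two selection modes can be sketched in base R on a toy table. This is only an illustration of the semantics described above, not the package's implementation: the `cases` table and the helpers `trace_counts()`, `keep_by_interval()` and `keep_by_percentage()` are hypothetical. The percentage helper assumes the threshold is given as a fraction between 0 and 1, and it follows the "until at least p% of the cases are covered" reading from the argument description.

```r
# Toy case-to-trace table (hypothetical data): one row per case.
cases <- data.frame(
  case_id = 1:10,
  trace   = c(rep("A,B,C", 5), rep("A,C", 3), "A,B", "A"),
  stringsAsFactors = FALSE
)

# Absolute frequency per trace, most frequent first.
trace_counts <- function(cases) {
  sort(table(cases$trace), decreasing = TRUE)
}

# Interval mode: keep cases whose trace frequency lies in [lower, upper].
# An NA bound leaves that side of the interval open.
keep_by_interval <- function(cases, interval) {
  freq  <- trace_counts(cases)
  n     <- as.vector(freq)
  lower <- if (is.na(interval[1])) -Inf else interval[1]
  upper <- if (is.na(interval[2])) Inf else interval[2]
  kept  <- names(freq)[n >= lower & n <= upper]
  cases[cases$trace %in% kept, ]
}

# Percentage mode: add traces in order of decreasing frequency until
# at least a fraction `p` of the cases is covered.
keep_by_percentage <- function(cases, p) {
  freq <- trace_counts(cases)
  n    <- as.vector(freq)
  # Share of cases already covered before each trace is added.
  covered_before <- c(0, head(cumsum(n), -1)) / sum(n)
  kept <- names(freq)[covered_before < p]
  cases[cases$trace %in% kept, ]
}

frequent <- keep_by_interval(cases, c(3, NA))  # traces occurring 3 times or more
rare     <- keep_by_interval(cases, c(NA, 1))  # traces occurring at most once
top      <- keep_by_percentage(cases, 0.6)     # most common traces covering >= 60%
```

With this data, `frequent` holds the 8 cases of the two most common traces, `rare` holds the 2 cases with unique traces, and `top` needs both common traces to reach 60% coverage, so it also holds 8 cases.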
Swennen, M. (2018). Using Event Log Knowledge to Support Operational Excellence Techniques (Doctoral dissertation). Hasselt University.
Other filters:
filter_activity_frequency(), filter_activity_instance(), filter_activity_presence(), filter_activity(), filter_case_condition(), filter_case(), filter_endpoints_condition(), filter_endpoints(), filter_flow_time(), filter_idle_time(), filter_infrequent_flows(), filter_lifecycle_presence(), filter_lifecycle(), filter_precedence_condition(), filter_precedence_resource(), filter_precedence(), filter_processing_time(), filter_resource_frequency(), filter_resource(), filter_throughput_time(), filter_time_period(), filter_trace_length(), filter_trace(), filter_trim_lifecycle(), filter_trim()