Excludes duplicate records from the same day or period before passing the analysis to episodes.
Only duplicate records that will not affect the case definition are excluded.
The resulting episode identifiers are recycled for the duplicate records.
episodes_wf_splits(..., duplicates_recovered = "ANY", reframe = FALSE)
epid; list
...: Arguments passed to episodes.
duplicates_recovered: [character]. Determines which duplicate records are recycled. Options are "ANY" (default), "without_sub_criteria", "with_sub_criteria" or "ALL". See Details.
reframe: [logical]. Determines if the duplicate records in a sub_criteria are reframed (TRUE) or excluded (FALSE).
episodes_wf_splits() is a wrapper function of episodes() which reduces or re-frames the dataset to the minimum number of records required to implement a case definition. This leads to the same outcome but with the benefit of a shorter processing time.
Duplicate records from the same point or period in time are excluded from episodes(). The resulting epid object is then recycled for the duplicates.
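The exclude-then-recycle idea can be sketched in base R. This is a minimal illustration and not the package's implementation: group_dates() is a hypothetical stand-in for an expensive grouping step (it is not episodes()), and the sample dates are made up for the example. The grouping runs once on the unique dates, and match() then recycles the resulting identifiers for every duplicate record.

```r
# Hypothetical stand-in for an expensive grouping step (NOT diyar's episodes()):
# a new group starts whenever the gap to the previous date exceeds `window` days.
group_dates <- function(x, window = 1) {
  ord <- order(x)
  gaps <- c(Inf, diff(as.numeric(x[ord])))
  ids <- cumsum(gaps > window)
  out <- integer(length(x))
  out[ord] <- ids   # put identifiers back in the original record order
  out
}

# Illustrative data: 5 distinct dates, each duplicated 1,000 times
dates <- as.Date("2019-04-01") + c(0, 1, 5, 6, 10)
dates <- rep(dates, 1000)

# Full run on every record
full <- group_dates(dates)

# Reduced run: group the unique dates only, then recycle for the duplicates
uniq <- unique(dates)
reduced <- group_dates(uniq)
recycled <- reduced[match(dates, uniq)]

# Both approaches assign the same identifier to every record
identical(full, recycled)
```

The reduced run handles 5 records instead of 5,000, which is where a saving in processing time would come from under these assumptions.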
The duplicates_recovered argument determines which identifiers are recycled. If "without_sub_criteria" is selected, only identifiers created from a matched sub_criteria ("Case_CR" and "Recurrent_CR") are recycled. The opposite ("Case" and "Recurrent") is the case if "with_sub_criteria" is selected. Excluded duplicates of "Duplicate_C" and "Duplicate_R" are always recycled.
The reframe argument will either reframe or subset a sub_criteria. Both will require slightly different functions for match_funcs or equal_funcs.
episodes; sub_criteria
library(diyar)

# With 10,000 duplicate records of 20 events,
# `episodes_wf_splits()` will take less time than `episodes()`
dates <- seq(from = as.Date("2019-04-01"), to = as.Date("2019-04-20"), by = 1)
dates <- rep(dates, 10000)
system.time(
  ep1 <- episodes(dates, 1)
)
system.time(
  ep2 <- episodes_wf_splits(dates, 1)
)
# Both lead to the same outcome.
all(ep1 == ep2)