diyar (version 0.4.0)

episodes_wf_splits: Track episodes in a reduced dataset.

Description

Excludes duplicate records from the same day or period before passing the analysis to episodes. Only duplicate records that will not affect the case definition are excluded. The resulting episode identifiers are then recycled for the excluded duplicate records.

Usage

episodes_wf_splits(..., duplicates_recovered = "ANY", reframe = FALSE)

Arguments

...

Arguments passed to episodes.

duplicates_recovered

[character]. Determines which duplicate records are recycled. Options are "ANY" (default), "without_sub_criteria" or "with_sub_criteria". See Details.

reframe

[logical]. Determines if the duplicate records in a sub_criteria are reframed (TRUE) or excluded (FALSE).

Value

epid; list

Details

episodes_wf_splits() is a wrapper around episodes() that reduces or reframes the dataset to the minimum number of records required to implement a case definition. This leads to the same outcome as episodes() but with a shorter processing time.

Duplicate records from the same point or period in time are excluded from episodes(). The resulting epid object is then recycled for the duplicates.
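
The exclude-then-recycle idea described above can be sketched in base R. This is only an illustration under stated assumptions: toy_track() is a hypothetical stand-in for an expensive tracking step and is not the algorithm used by episodes(), and the sketch ignores sub_criteria entirely.

```r
# Hypothetical stand-in for an expensive tracking step (NOT diyar's algorithm):
# a new group starts whenever the gap to the previous sorted date exceeds `gap` days.
toy_track <- function(x, gap = 1) {
  ord <- order(x)
  starts <- c(TRUE, diff(as.numeric(x[ord])) > gap)
  ids <- integer(length(x))
  ids[ord] <- cumsum(starts)
  ids
}

# Eight records, several of which share the same date.
dates <- as.Date("2019-04-01") + c(0, 0, 1, 1, 5, 5, 5, 9)

# 1. Keep one record per distinct date.
unique_dates <- unique(dates)

# 2. Run the tracking step on the reduced data only.
unique_ids <- toy_track(unique_dates, gap = 1)

# 3. Recycle the identifiers for the excluded duplicates.
recycled_ids <- unique_ids[match(dates, unique_dates)]

# The recycled identifiers match those from tracking the full data.
full_ids <- toy_track(dates, gap = 1)
identical(recycled_ids, full_ids)
```

In this sketch the tracking step sees 4 records instead of 8, and match() maps each original record back to its retained duplicate. The saving grows with the share of duplicate records in the data.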

The duplicates_recovered argument determines which identifiers are recycled. If "without_sub_criteria" is selected, only identifiers created from a matched sub_criteria ("Case_CR" and "Recurrent_CR") are recycled. The opposite ("Case" and "Recurrent") is the case if "with_sub_criteria" is selected. Excluded duplicates of "Duplicate_C" and "Duplicate_R" are always recycled.

The reframe argument determines whether the duplicate records in a sub_criteria are reframed (TRUE) or excluded (FALSE). The two options require slightly different functions for match_funcs or equal_funcs.

See Also

episodes; sub_criteria

Examples

# NOT RUN {
library(diyar)

# With 10,000 copies of each of 20 event dates,
# `episodes_wf_splits()` should take less time than `episodes()`
dates <- seq(from = as.Date("2019-04-01"), to = as.Date("2019-04-20"), by = 1)
dates <- rep(dates, 10000)

system.time(
  ep1 <- episodes(dates, 1)
)
system.time(
  ep2 <- episodes_wf_splits(dates, 1)
)

# Both lead to the same outcome.
all(ep1 == ep2)
# }