epikit (version 0.2.0)

find_date_cause: Find the first date beyond a cutoff in several columns

Description

find_date_cause() finds the first date in an ordered series of columns that falls within a specified period. If none of the supplied columns contains a date within the period, the result falls back to a period boundary: the start or the end, as selected by na_fill.
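The selection rule described above can be sketched in base R. This is only an illustration under stated assumptions, not epikit's implementation: the helper name first_date_in_period() and its arguments are hypothetical, the period is assumed to be inclusive at both ends, and rows with no in-period date are assumed to fall back to the period start with the reason recorded as "period_start".

```r
# Hypothetical helper: illustrates the rule only; this is not epikit's code.
# `cols` is a named list of Date vectors, most important column first.
first_date_in_period <- function(cols, period_start, period_end) {
  n <- length(cols[[1]])
  date <- rep(as.Date(NA), n)
  reason <- rep(NA_character_, n)
  for (nm in names(cols)) {
    candidate <- cols[[nm]]
    # Assumption: the period is inclusive at both ends.
    in_period <- !is.na(candidate) &
      candidate >= period_start & candidate <= period_end
    # Only fill rows that no earlier (higher-priority) column has filled.
    take <- is.na(date) & in_period
    date[take] <- candidate[take]
    reason[take] <- nm
  }
  # Assumed fallback: rows with no in-period date get the period start.
  empty <- is.na(date)
  date[empty] <- rep(period_start, length.out = n)[empty]
  reason[empty] <- "period_start"
  data.frame(date = date, reason = reason, stringsAsFactors = FALSE)
}

res <- first_date_in_period(
  list(
    s1 = as.Date(c("2013-01-05", "2012-06-01", "2014-01-01")),
    s2 = as.Date(c("2013-01-02", "2013-01-03", "2012-01-01"))
  ),
  period_start = as.Date("2012-12-31"),
  period_end   = as.Date("2013-01-09")
)
print(res)
```

In this sketch the first row takes its date from s1 even though s2 holds an earlier in-period date, because column order sets the priority rather than the date values.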

Usage

find_date_cause(
  x,
  ...,
  period_start = NULL,
  period_end = NULL,
  datecol = "start_date",
  datereason = "start_date_reason",
  na_fill = "start"
)

find_start_date(
  x,
  ...,
  period_start = NULL,
  period_end = NULL,
  datecol = "start_date",
  datereason = "start_date_reason"
)

find_end_date(
  x,
  ...,
  period_start = NULL,
  period_end = NULL,
  datecol = "end_date",
  datereason = "end_date_reason"
)

constrain_dates(i, period_start, period_end, boundary = "both")

assert_positive_timespan(x, date_start, date_end)

Arguments

x

a data frame

...

an ordered series of date columns (i.e. the most important date to be considered first). Earlier columns take precedence in case of ties.

period_start, period_end

for the find_ functions, this should be the name of a column in x that contains the start/end of the recall period. For constrain_dates, this should be a vector of dates.

datecol

the name of the new column to contain the dates

datereason

the name of the column to contain the name of the column from which the date came.

na_fill

one of "start", "end", or NULL. If "start" or "end", NA values in the result will be replaced with the corresponding period boundary. If NULL, NAs are left as-is.

i

a vector of dates

boundary

one of "both", "start", or "end". Dates outside of the boundary will be set to NA.

date_start, date_end

the names of the columns in x that contain the start and end dates of the timespan to check
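The effect of the boundary argument can be sketched in base R. The helper mask_outside_period() below is hypothetical and is not epikit's constrain_dates(); it assumes that "start" masks only dates before period_start, that "end" masks only dates after period_end, and that dates equal to a boundary are kept.

```r
# Hypothetical helper mirroring the documented `boundary` options;
# this is not epikit's own constrain_dates().
mask_outside_period <- function(i, period_start, period_end,
                                boundary = c("both", "start", "end")) {
  boundary <- match.arg(boundary)
  too_early <- !is.na(i) & i < period_start
  too_late  <- !is.na(i) & i > period_end
  # Assumption: "start" checks only the lower bound, "end" only the upper.
  outside <- switch(boundary,
    both  = too_early | too_late,
    start = too_early,
    end   = too_late
  )
  i[outside] <- NA
  i
}

dates <- as.Date(c("2012-12-01", "2013-01-05", "2013-02-01"))
ps <- as.Date("2012-12-31")
pe <- as.Date("2013-01-09")

both_masked  <- mask_outside_period(dates, ps, pe)
start_masked <- mask_outside_period(dates, ps, pe, boundary = "start")
print(both_masked)
print(start_masked)
```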

Examples

library(epikit)

# three candidate date columns plus the recall period start (ps) and end (pe)
d <- data.frame(
  s1 = c(as.Date("2013-01-01") + 0:10, as.Date(c("2012-01-01", "2014-01-01"))),
  s2 = c(as.Date("2013-02-01") + 0:10, as.Date(c("2012-01-01", "2014-01-01"))),
  s3 = c(as.Date("2013-01-10") - 0:10, as.Date(c("2012-01-01", "2014-01-01"))),
  ps = as.Date("2012-12-31"),
  pe = as.Date("2013-01-09")
)
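# s1 has the highest priority; rows with no date in the period fall back to the period start
print(dd <- find_date_cause(d, s1, s2, s3, period_start = ps, period_end = pe))
# fall back to the period end instead and rename the output columns
print(bb <- find_date_cause(d, s1, s2, s3, period_start = ps, period_end = pe,
                            na_fill = "end",
                            datecol = "enddate",
                            datereason = "endcause"))
# reverse the column order so that s3 takes precedence
find_date_cause(d, s3, s2, s1, period_start = ps, period_end = pe)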

# works
assert_positive_timespan(dd, start_date, pe)

# returns a warning because the last date isn't later than the start_date
assert_positive_timespan(dd, start_date, s2)


with(d, constrain_dates(s1, ps, pe))
with(d, constrain_dates(s2, ps, pe))
with(d, constrain_dates(s3, ps, pe))
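The kind of check that assert_positive_timespan() performs in the examples above can be sketched in base R. The helper check_positive_timespan() is hypothetical and is not epikit's implementation: it takes quoted column names, assumes that only rows whose end date is earlier than the start date should be flagged, and returns the offending rows invisibly.

```r
# Hypothetical check; this is not epikit's assert_positive_timespan().
check_positive_timespan <- function(x, date_start, date_end) {
  # Span in days; negative means the end date precedes the start date.
  span <- as.numeric(x[[date_end]] - x[[date_start]])
  bad <- which(!is.na(span) & span < 0)
  if (length(bad) > 0) {
    warning(length(bad), " row(s) have an end date before the start date")
  }
  invisible(x[bad, , drop = FALSE])
}

df <- data.frame(
  start = as.Date(c("2013-01-01", "2013-01-05")),
  end   = as.Date(c("2013-01-03", "2013-01-02"))
)
# suppressWarnings() keeps the demo quiet; drop it to see the warning.
bad_rows <- suppressWarnings(check_positive_timespan(df, "start", "end"))
print(bad_rows)
```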