rows_per_date: Find active rows per date

Description

Fast overlap joins. Usually, df is a very large data.table (e.g. insurance portfolio) with small interval ranges, and dates is much smaller with (e.g.) claim dates.

Usage

rows_per_date(
  df,
  dates,
  df_begin,
  df_end,
  dates_date,
  ...,
  nomatch = NULL,
  mult = "all"
)

Value

returned class is equal to class of df

Arguments

df: data.frame with portfolio (df should include time period)
dates: data.frame with dates to join
df_begin: column name with begin dates of time period in df
df_end: column name with end dates of time period in df
dates_date: column name with dates in dates
...: additional column names in dates to join by
nomatch: When a row (with interval say, [a,b]) in x has no match in y, nomatch=NA means NA is returned for y's non-by.y columns for that row of x. nomatch=NULL (default) means no rows will be returned for that row of x.
mult: When multiple rows in y match to the row in x, mult controls which values are returned - "all" (default), "first" or "last".

Author

Martin Haringa

Examples

Run this code

library(lubridate)
portfolio <- data.frame(
begin1 = ymd(c("2014-01-01", "2014-01-01")),
end = ymd(c("2014-03-14", "2014-05-10")),
termination = ymd(c("2014-03-14", "2014-05-10")),
exposure = c(0.2025, 0.3583),
premium =  c(125, 150),
car_type = c("BMW", "TESLA"))

## Find active rows on different dates
dates0 <- data.frame(active_date = seq(ymd("2014-01-01"), ymd("2014-05-01"),
by = "months"))
rows_per_date(portfolio, dates0, df_begin = begin1, df_end = end,
dates_date = active_date)

## With extra identifiers (merge claim date with time interval in portfolio)
claim_dates <- data.frame(claim_date = ymd("2014-01-01"),
car_type = c("BMW", "VOLVO"))

### Only rows are returned that can be matched
rows_per_date(portfolio, claim_dates, df_begin = begin1,
   df_end = end, dates_date = claim_date, car_type)

### When row cannot be matched, NA is returned for that row
rows_per_date(portfolio, claim_dates, df_begin = begin1,
   df_end = end, dates_date = claim_date, car_type, nomatch = NA)

Run the code above in your browser using DataLab