popEpi (version 0.2.1)

splitMulti: Split case-level observations

Description

Split a Lexis object along multiple time scales with speed and ease

Usage

splitMulti(data, breaks = NULL, ..., drop = TRUE, merge = TRUE,
  verbose = FALSE)

Arguments

data
a Lexis object with event cases as rows
breaks
a list of named numeric vectors of breaks; see Details and Examples
...
alternate way of supplying breaks as named vectors; e.g. fot = 0:5 instead of breaks = list(fot = 0:5); if breaks is not NULL, breaks is used and any breaks passed through ... are ignored
drop
logical; if TRUE, drops all resulting rows after expansion that reside outside the time window defined by the given breaks
merge
logical; if TRUE, retains all variables from the original data - i.e. original variables are repeated for all the rows by original subject
verbose
logical; if TRUE, the function is chatty and prints some messages along the way

Value

  • A data.table or data.frame (depending on options("popEpi.datatable"); see ?popEpi) object expanded to accommodate split observations.

Details

splitMulti is in essence a data.table version of splitLexis or survSplit for splitting along multiple time scales. It requires a Lexis object as input.

The breaks must be a list of named numeric vectors. The breaks are fully explicit, left-inclusive and right-exclusive: e.g. fot=c(0,5) forces the data to only include time within [0,5) for each original row (unless drop = FALSE). Use Inf or -Inf for open-ended intervals, e.g. per=c(1990,1995,Inf) creates the intervals [1990,1995), [1995, Inf).

Instead of specifying breaks, one may use the ... argument to pass breaks: e.g. splitMulti(x, breaks = list(fot = 0:5)) is equivalent to splitMulti(x, fot = 0:5). Multiple breaks can be supplied in the same manner. However, if both breaks and ... are used, only the breaks in breaks are used by the function.
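The interval convention described in Details can be illustrated without popEpi at all. The following is a minimal base-R sketch under stated assumptions, not splitMulti's actual implementation: the helper split_interval() is hypothetical, handles only one time scale, and treats a single follow-up interval [entry, exit). It cuts the interval at the breaks and, mimicking drop = TRUE, discards the pieces that fall outside [min(breaks), max(breaks)).

```r
# Hypothetical helper (not part of popEpi): split one follow-up interval
# [entry, exit) at the given breaks on a single time scale.
split_interval <- function(entry, exit, breaks, drop = TRUE) {
  breaks <- sort(unique(breaks))
  # break points that fall strictly inside the follow-up interval
  inner <- breaks[breaks > entry & breaks < exit]
  out <- data.frame(start = c(entry, inner), stop = c(inner, exit))
  if (drop) {
    # keep only pieces lying within [min(breaks), max(breaks))
    keep <- out$start >= min(breaks) & out$stop <= max(breaks)
    out <- out[keep, , drop = FALSE]
  }
  rownames(out) <- NULL
  out
}

# fot = c(0, 5): follow-up from 0 to 7.5 is truncated to [0, 5)
split_interval(0, 7.5, breaks = c(0, 5))

# per = c(1990, 1995, Inf): pieces [1990, 1995) and [1995, 1997.2) remain;
# the time before 1990 is dropped
split_interval(1988.5, 1997.2, breaks = c(1990, 1995, Inf))

# with drop = FALSE the piece [1988.5, 1990) is retained as well
split_interval(1988.5, 1997.2, breaks = c(1990, 1995, Inf), drop = FALSE)
```

The sketch only conveys why Inf is needed to keep follow-up open-ended: without it, max(breaks) closes the window and later time is dropped.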

See Also

splitLexis, Lexis, survSplit, splitLexisDT

Examples

#### prepare data for computing period-method survival estimates;
#### to avoid problems with dates, we first convert them
#### to fractional years.
library(Epi)
library(popEpi)     # splitMulti, get.yrs and the sire data
library(data.table) # copy and as.IDate
x <- Lexis(data=sire, entry = list(fot=0, per=get.yrs(dg_date), age=dg_age),
           exit=list(per=get.yrs(ex_date)), exit.status=status)
x2 <- splitMulti(x, breaks = list(fot=seq(0, 5, by = 3/12), per=c(2008, 2013)))
# equivalently:
x2 <- splitMulti(x, fot=seq(0, 5, by = 3/12), per=c(2008, 2013))

## using dates; note: breaks must be expressed as dates or days!
x <- Lexis(data=sire, entry = list(fot=0, per=dg_date, age=dg_date-bi_date),
           exit=list(per=ex_date), exit.status=status)
BL <- list(fot = seq(0, 5, by = 3/12)*365.242199,
           per = as.IDate(paste0(c(1980:2014),"-01-01")),
           age = c(0,45,85,Inf)*365.242199)
x2 <- splitMulti(x, breaks = BL, verbose=TRUE)

## multistate (healthy - sick - dead)
## pretend some observations never got cancer
set.seed(1L)

sire2 <- copy(sire)
sire2$status <- factor(sire2$status, levels = 0:2)
levels(sire2$status) <- c("healthy", "dead", "dead")

not_sick <- sample.int(nrow(sire2), 6000L, replace = FALSE)
sire2[not_sick, ]$dg_date <- NA
sire2[!is.na(dg_date) & status == "healthy", ]$status <- "sick"

xm <- Lexis(data=sire2, entry = list(fot=0, per=get.yrs(bi_date), age=0),
            exit=list(per=get.yrs(ex_date)), exit.status=status)
xm2 <- cutLexis(xm, cut = get.yrs(xm$dg_date), timescale = "per", new.state = "sick")
xm2[xm2$lex.id == 6L, ]

xm2 <- splitMulti(xm2, breaks = list(fot = seq(0,150,25)))
xm2[xm2$lex.id == 6L, ]
