epidata: Continuous-Time SIR Event History of a Fixed Population

Description

The function as.epidata generates objects of class "epidata". Objects of this class are data frames containing the event history of an epidemic together with some additional attributes. They are the basis for fitting spatio-temporal epidemic intensity models with the function twinSIR. Their implementation is illustrated in Meyer et al. (2016, Section 4); see vignette("twinSIR"). Note that the spatial information itself, i.e., the positions of the individuals, is assumed to be constant over time. Besides epidemics following the SIR compartmental model, data from SI, SIRS and SIS epidemics may also be supplied. Inference for the infectious process works as usual, and simulation of such epidemics is also possible.

Usage

as.epidata(data, ...)
## S3 method for class 'data.frame':
as.epidata(data, t0,
           tE.col, tI.col, tR.col, id.col, coords.cols,
           f = list(), w = list(), D = dist, keep.cols = TRUE, ...)
## S3 method for class 'default':
as.epidata(data, id.col, start.col, stop.col,
           atRiskY.col, event.col, Revent.col, coords.cols,
           f = list(), w = list(), D = dist, ...)
## S3 method for class 'epidata':
print(x, ...)
## S3 method for class 'epidata':
x[i, j, drop]
## S3 method for class 'epidata':
update(object, f = list(), w = list(), D = dist, ...)

Arguments

data

For the data.frame-method, a data frame with as many rows as there are individuals in the population and time columns indicating when each individual became exposed (optional), infectious (mandatory, but can be NA for

t0

start time of the observation period. Will be subtracted from the time columns tE.col, tI.col, tR.col. Individuals that have already been removed prior to t0, i.e., rows with tR

tE.col, tI.col, tR.col

single numeric or character indexes of the time columns in data, which specify when the individuals became exposed, infectious and removed, respectively. tE.col and tR.col can be missing, corresponding to

id.col

single numeric or character index of the id column in data. The id column identifies the individuals in the data frame. It is converted to a factor by calling factor

start.col

single index of the start column in data.  Can be numeric
    (by column number) or character (by column name).
    The start column contains the (numeric) time points of the beginnings
    of the consecutive time in

stop.col

single index of the stop column in data.  Can be numeric
    (by column number) or character (by column name).
    The stop column contains the (numeric) time points of the ends
    of the consecutive time intervals

atRiskY.col

single index of the atRiskY column in data.  Can be numeric
    (by column number) or character (by column name).
    The atRiskY column indicates if the individual was at-risk
    of becoming infect

event.col

single index of the event column in data.  Can be numeric
    (by column number) or character (by column name).
    The event column indicates if the individual became infected
    at the stop t

Revent.col

single index of the Revent column in data.  Can be numeric
    (by column number) or character (by column name).
    The Revent column indicates if the individual was recovered 
    at the stop

coords.cols

indexes of the coords columns in data. Can be
    numeric (by column number), character (by column name), or NULL
    (no coordinates, e.g., if D is a pre-specified distance matrix).

f

a named list of vectorized functions for a
    distance-based force of infection.
    The functions must interact elementwise on a (distance) matrix D so that
    f[[m]](D) results in a matrix.  A simple example

w

a named list of vectorized functions for extra 
    covariate-based weights $w_{ij}$ in the epidemic component.
    Each function operates on a single time-constant covariate in
    data, which is determined by the name of t

D

either a function to calculate the distances between the individuals
    with locations taken from coords.cols (the default is
    Euclidean distance via the function dist) and
    the result conve

keep.cols

logical indicating if all columns in data
    should be retained (and not only the obligatory "epidata"
    columns), in particular any additional columns with 
    time-constant individual-specific covariates.
    Alternatively,

x,object

an object of class "epidata".

...

arguments passed to print.data.frame. Currently unused
    in the as.epidata-methods.

i,j,drop

arguments passed to [.data.frame.
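
The interplay of the f and D arguments can be illustrated with a small base-R sketch. It applies a named list of vectorized functions elementwise to a Euclidean distance matrix, so that each f[[m]](D) is again a matrix. The coordinates and the object names coords, Dmat and epiVars are made up for illustration; this is not the internal code of as.epidata.

```r
# Toy coordinates of four individuals (hypothetical values)
coords <- cbind(x = c(0, 1, 0, 3), y = c(0, 0, 2, 4))
rownames(coords) <- paste0("id", 1:4)

# Euclidean distances via dist(), converted to a full symmetric matrix
Dmat <- as.matrix(dist(coords))

# Named list of vectorized distance functions, as in the f argument
f <- list(B1 = function(u) u <= 1, B2 = function(u) u > 1)

# Apply each function elementwise; every result is a 4 x 4 matrix
epiVars <- lapply(f, function(fm) fm(Dmat))
epiVars$B1
```

Because both functions only use vectorized comparisons, they return an object of the same dimensions as their input, which is what the requirement on f amounts to.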

Value

a data.frame with the columns "BLOCK", "id",
  "start", "stop", "atRiskY", "event",
  "Revent" and the coordinate columns (with the original names from
  data), which are all obligatory.  These columns are followed by any 
  remaining columns of the input data.  Last but not least, the newly
  generated columns with epidemic variables corresponding to the functions
  in the list f are appended, if length(f) > 0.
  
  The data.frame is given the additional attributes
  "eventTimes": numeric vector of infection time points (sorted chronologically).
  "timeRange": numeric vector of length 2: c(min(start), max(stop)).
  "coords.cols": numeric vector containing the column indices of the coordinate columns in
    the resulting data frame.
  "f": this equals the argument f.
  "w": this equals the argument w.
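
To make the "eventTimes" and "timeRange" attributes concrete, the following base-R sketch derives the same quantities from a hand-made event history in the column layout described above. The data frame evHist and its values are hypothetical, and the computation only mirrors the definitions given here; it is not claimed to be how as.epidata computes them internally.

```r
# Toy event history (hypothetical values):
# two individuals "a" and "b", three consecutive time blocks
evHist <- data.frame(
  BLOCK   = rep(1:3, each = 2),
  id      = factor(rep(c("a", "b"), times = 3)),
  start   = rep(c(0, 1.5, 2), each = 2),
  stop    = rep(c(1.5, 2, 4), each = 2),
  atRiskY = c(1, 1, 0, 1, 0, 0),
  event   = c(1, 0, 0, 1, 0, 0),  # "a" infected at 1.5, "b" at 2
  Revent  = c(0, 0, 0, 0, 1, 0)   # "a" removed at 4
)

# Infection time points, sorted chronologically
eventTimes <- sort(unique(evHist$stop[evHist$event == 1]))

# Time range of the observation period
timeRange <- c(min(evHist$start), max(evHist$stop))

eventTimes
timeRange
```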

Details

The print method for objects of class "epidata" simply prints
  the data frame with a small header containing the time range of the observed
  epidemic and the number of infected individuals.  Since the data frames
  are usually quite long, the summary method summary.epidata might be
  more useful.  Indexing/subsetting "epidata" works exactly as for
  data.frames, but a dedicated method ensures consistency of the
  resulting "epidata" or drops this class if necessary.
  The update-method can be used to add or replace distance-based
  (f) or covariate-based (w) epidemic variables in an
  existing "epidata" object.
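
As a sketch of what a covariate-based weight function in w expresses, the base-R code below evaluates a function of the form function(z1.i, z1.j) for all pairs of individuals and then sums the weights over the currently infectious individuals. The covariate values, the infectious indicator and the names W and xSamez1 are assumptions made for illustration, and the use of outer is only one way to obtain the pairwise weights, not necessarily what update does internally.

```r
# Time-constant covariate z1 for three individuals (hypothetical values)
z1 <- c(id1 = 1, id2 = 0, id3 = 1)

# Weight function in the form used for the w argument
samez1 <- function(z1.i, z1.j) z1.i == z1.j

# Pairwise weights w_ij: rows index i, columns index j
W <- outer(z1, z1, FUN = samez1)

# Suppose only id1 is currently infectious (hypothetical state)
infectious <- c(id1 = TRUE, id2 = FALSE, id3 = FALSE)

# For each susceptible i, sum the weights w_ij over infectious j
xSamez1 <- rowSums(W[!infectious, infectious, drop = FALSE])
xSamez1
```

Here id3 shares its z1 value with the infectious individual and therefore receives a weight of 1, whereas id2 receives 0.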
  
  SIS epidemics are implemented as SIRS epidemics where the length of the
  removal period equals 0.  This means that an individual who has an R-event
  will be at risk again immediately afterwards, i.e., in the following time block.
  Therefore, data of SIS epidemics have to be provided in this form, containing
  pseudo-R-events.
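
A minimal sketch of such an SIS event history for a single individual might look as follows, assuming the column semantics described under Arguments. The times and the object name sisHist are hypothetical, and the data frame is built by hand rather than by any function of the package.

```r
# One individual in a toy SIS epidemic (hypothetical times):
# infected at t = 1, pseudo-R-event at t = 2, at risk again afterwards
sisHist <- data.frame(
  BLOCK   = 1:3,
  id      = factor("a"),
  start   = c(0, 1, 2),
  stop    = c(1, 2, 3),
  atRiskY = c(1, 0, 1),  # at risk, infectious, at risk again
  event   = c(1, 0, 0),  # infection at the stop time of block 1
  Revent  = c(0, 1, 0)   # pseudo-R-event at the stop time of block 2
)

# The block after the R-event starts at the R-event time with atRiskY = 1,
# i.e., the removal period has length 0
rBlock <- which(sisHist$Revent == 1)
sisHist[rBlock + 1, c("start", "atRiskY")]
```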

References

Meyer, S., Held, L. and Höhle, M. (2016):
  Spatio-temporal analysis of epidemic phenomena using the R package
  surveillance. Journal of Statistical Software. In press.
Preprint available at http://arxiv.org/abs/1411.0416

See Also

The hagelloch data for a real "epidata" object.
The code for the conversion from the simple data frame to the SIR event
history using as.epidata.data.frame is given in
example(hagelloch).
The plot and summary methods for class "epidata".
Furthermore, the function animate.epidata for the animation of
epidemics.
Function twinSIR for fitting spatio-temporal epidemic intensity
models to epidemic data.
Function simEpidata for the simulation of epidemic data.

Examples

# see help("hagelloch") for an example with a real data set

# here is an artificial event history
data("foodata")
str(foodata)

# convert the data to an object of class "epidata",
# also generating some epidemic covariates
myEpidata <- as.epidata(foodata,
  id.col = 1, start.col = "start", stop.col = "stop",
  atRiskY.col = "atrisk", event.col = "infected", Revent.col = "removed",
  coords.cols = c("x","y"),
  f = list(B1 = function(u) u <= 1, B2 = function(u) u > 1))

# this is how data("fooepidata") has been generated
data("fooepidata")
stopifnot(all.equal(myEpidata, fooepidata))

# add covariate-based weight for the force of infection, e.g.,
# to model an increased force if i and j have the same value in z1
myEpidata2 <- update(fooepidata,
                     w = list(samez1 = function(z1.i, z1.j) z1.i == z1.j))

str(fooepidata)
subset(fooepidata, BLOCK == 1)

summary(fooepidata)          # see 'summary.epidata'
plot(fooepidata)             # see 'plot.epidata' and also 'animate.epidata'
stateplot(fooepidata, "15")  # see 'stateplot'