Compute the incidence of events
incidence(
x,
date_index,
groups = NULL,
interval = 1L,
first_date = NULL,
last_date = NULL,
na_as_group = TRUE,
standard = TRUE,
count = NULL
)
x: A tibble or a data frame (see Note) representing a linelist.
date_index: The time index of the given data. This should be the name, with or without quotation, of a date column in x. The column must be of one of the classes integer, numeric, Date, POSIXct, POSIXlt, or character (see Note about numeric and character formats).
groups: An optional vector giving the names of the grouping variables by which incidence should be computed. These can be given with or without quotation.
interval: An integer or character indicating the (fixed) size of the time interval used for computing the incidence; defaults to 1 day. This can also be a text string that corresponds to a valid date interval: day, week, month, quarter, or year (see Note).
first_date, last_date: Optional first/last dates to be used. When these are NULL (default), the first/last dates are taken from the observations. If these dates are provided, the observations will be trimmed to the range [first_date, last_date].
na_as_group: A logical value indicating whether missing group values (NA) should be treated as a separate category (TRUE) or removed from consideration (FALSE).
standard: (Only applicable where date_index references a Date object.) When TRUE (default) and the interval is one of "week", "month", "quarter", or "year", the bins for the counts start at the beginning of the interval (see Note).
count: The count variable of the given data. If NULL (default), the data is taken to be a linelist of individual observations.
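The trimming and binning described for first_date, last_date, interval and standard can be pictured with a small base R sketch. This is only an illustration of the idea, not how incidence() is implemented: the dates and bounds are made up, and cut() stands in for the package's own binning.

```r
# Base R illustration only; incidence() itself is not called here.
dates <- as.Date(c("2021-01-04", "2021-01-05", "2021-01-12",
                   "2021-01-20", "2021-02-03"))

# Hypothetical bounds, playing the role of first_date and last_date
first_date <- as.Date("2021-01-05")
last_date  <- as.Date("2021-01-31")

# Trim the observations to the closed range [first_date, last_date]
kept <- dates[dates >= first_date & dates <= last_date]

# cut() on Date objects with breaks = "week" starts each bin on a Monday,
# so the first bin begins before first_date (similar in spirit to standard = TRUE)
bins <- cut(kept, breaks = "week")

# One row per bin: the bin's left edge and the number of observations in it
weekly <- as.data.frame(table(bin_date = bins), responseName = "count")
weekly
```

The first bin is labelled with the Monday on or before the earliest retained date, which is why it can start earlier than first_date.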
An incidence2 object. This is a subclass of tibble that represents an aggregated count of observations grouped according to the specified interval and, optionally, the given groups. By default it will contain the following columns:
bin_date: The dates marking the left side of the bins used for counting events. When standard = TRUE and the interval represents weeks, months, quarters, or years, the first date will represent the first standard date (see Interval specification, below).
-groups-: If specified, column(s) containing the categories of the given groups.
count: The aggregated observation count.
If a "week" interval is specified then the object may also contain additional columns:
weeks: Dates in week format (YYYY-Www), where YYYY corresponds to the year of the given week and ww represents the numeric week of the year. These are produced by the function aweek::date2week(). Note that these will have a special "week_start" attribute indicating which day of the ISO week the week starts on (see Weeks, below).
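For readers unfamiliar with the YYYY-Www notation, the following base R sketch shows how such labels look for weeks that start on a Monday (ISO 8601 weeks). It uses format() rather than aweek::date2week(), so it is only an approximation of the labels in the weeks column and does not cover other week start days or the "week_start" attribute.

```r
# Base R approximation of YYYY-Www labels for Monday-start (ISO 8601) weeks.
# %G is the ISO week-based year and %V the ISO week number.
d <- as.Date(c("2021-01-03", "2021-01-04"))  # a Sunday and the following Monday
iso_labels <- format(d, "%G-W%V")
iso_labels
```

Note that the year in the label is the year the week belongs to, which can differ from the calendar year of the date near the turn of the year.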
# NOT RUN {
library(incidence2) # provides incidence()
library(magrittr)   # provides the %>% pipe used below
if (requireNamespace("outbreaks", quietly = TRUE)) {
  withAutoprint({
    data(ebola_sim_clean, package = "outbreaks")
    dat <- ebola_sim_clean$linelist
    # daily incidence
    dat %>%
      incidence(date_of_onset)
    # weekly incidence (standard = FALSE: bins are not aligned to the start of the week)
    dat %>%
      incidence(date_of_onset, interval = "week", standard = FALSE)
    # weekly incidence, weeks starting on a Monday
    dat %>%
      incidence(date_of_onset, interval = "isoweek")
    # weekly incidence, weeks starting on a Sunday
    dat %>%
      incidence(date_of_onset, interval = "epiweek")
    # weekly incidence, weeks starting on a Saturday
    dat %>%
      incidence(date_of_onset, interval = "saturday epiweek")
    # 7-day incidence, grouped by gender
    dat %>%
      incidence(date_of_onset, interval = 7, groups = gender)
    # two-week incidence, grouped by gender and hospital
    dat %>%
      incidence(date_of_onset,
                interval = "2 weeks",
                groups = c(gender, hospital))
  })
}
# use of first_date: observations before first_date are trimmed
dat <- data.frame(dates = Sys.Date() + sample(-3:10, 10, replace = TRUE))
dat %>%
  incidence(dates,
            interval = "week",
            first_date = Sys.Date() + 1)
# }