fit: Fit exponential models to incidence data

Description

The function fit fits one or two exponential models to incidence data, of the form $\log(y) = r \times t + b$, where 'y' is the incidence, 't' is time (in days), 'r' is the growth rate, and 'b' is the intercept (the log-incidence at the origin, day 0). By default, fit fits a single model; if the argument split is provided, it fits two models, one on either side of the splitting date (typically the peak of the epidemic). The function fit_optim_split can be used to find the optimal splitting date, defined as the one that maximizes the average R2 of the two models.
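As a short worked step (standard algebra on the model above, not taken from the package source), exponentiating the log-linear form gives the exponential curve. Requiring the incidence to double over a delay $T$ when $r > 0$ then gives the doubling time:

$$
y(t) = e^{b} \, e^{r t}, \qquad y(t + T) = 2 \, y(t) \;\Rightarrow\; e^{r T} = 2 \;\Rightarrow\; T = \frac{\log 2}{r}
$$

When $r < 0$, the same argument with $y(t + T) = y(t) / 2$ gives a halving time $T = \log(2) / (-r)$.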

Usage

fit(x, split = NULL, level = 0.95, quiet = FALSE)
fit_optim_split(x, window = x$timespan/4, plot = TRUE, quiet = TRUE)
# S3 method for incidence_fit
print(x, ...)
# S3 method for incidence_fit
plot(x, ..., col_pal = pal1)

Arguments

x

An incidence object, generated by the function incidence. For the print and plot methods, an incidence_fit object.

split

An optional time point identifying the separation between the two models. If NULL, a single model is fitted. If provided, two models are fitted, one on the time period on either side of the split.

level

The confidence level to be used for the prediction intervals; defaults to 0.95 (95%).

quiet

A logical indicating whether warnings from fit should be hidden; FALSE by default. Warnings typically indicate zero incidence values, which are removed before performing the log-linear regression.

window

The size, in days, of the time window on either side of the split.

plot

A logical indicating whether a plot should be added to the output, showing the mean R2 for various splits.

...

Further arguments passed to other methods (not used).

col_pal

The color palette to be used for the groups; defaults to pal1. See pal1 for other palettes implemented in incidence.

Value

For fit, a list with the class incidence_fit (for a single model), or a list containing two incidence_fit objects (when fitting two models). incidence_fit objects contain:

lm: the fitted linear model
info: a list containing information extracted from the model (detailed below)
origin: the date corresponding to day '0'

The $info item is a list containing:

r: the growth rate
r.conf: the confidence interval of 'r'
pred: a data.frame containing predictions of the model, including the true dates (dates), their numeric version used in the model (dates.x), the predicted value (fit), and the lower (lwr) and upper (upr) bounds of the associated confidence interval.
doubling: the predicted doubling time in days; exists only if 'r' is positive
doubling.conf: the confidence interval of the doubling time
halving: the predicted halving time in days; exists only if 'r' is negative
halving.conf: the confidence interval of the halving time
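The quantities listed above can be approximated by hand with base R. The following is a minimal sketch on made-up daily counts, assuming the doubling time is derived from the growth rate as log(2) / r; it is not the package's implementation, and the variable names are illustrative only.

```r
# Hypothetical daily incidence following near-exponential growth
t <- 0:19
y <- round(5 * exp(0.1 * t))

# Log-linear regression: log(y) = r * t + b
model <- lm(log(y) ~ t)

# Growth rate and its 95% confidence interval
r <- coef(model)[["t"]]
r_conf <- as.numeric(confint(model, "t", level = 0.95))

# Doubling time implied by a positive growth rate (assumed: log(2) / r).
# The bounds are reversed because a larger r means a shorter doubling time.
doubling <- log(2) / r
doubling_conf <- rev(log(2) / r_conf)

print(c(r = r, doubling = doubling))
print(doubling_conf)
```

With a negative growth rate, the same calculation on -r would give a halving time instead.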

For fit_optim_split, a list containing:

df: a data.frame of dates that were used in the optimization procedure, and the corresponding average R2 of the resulting models.
split: the optimal splitting date
fit: the resulting incidence_fit objects
plot: a plot showing the content of df (ggplot2 object)
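The idea behind the optimization can be sketched in base R as a scan over candidate split dates, keeping the one with the highest average R2 of the two log-linear models. This is an illustration on made-up data under stated assumptions (the candidate range, the helper mean_r2, and the choice to include the split day in both models are all hypothetical), not the code used by fit_optim_split.

```r
# Hypothetical daily incidence: growth until day 30, then decline
t <- 0:60
y <- round(ifelse(t <= 30,
                  10 * exp(0.1 * t),
                  10 * exp(0.1 * 30) * exp(-0.08 * (t - 30))))
d <- data.frame(t = t, y = y)

# Average R2 of the two log-linear models fitted on either side of a split
# (assumption: the split day is included in both models)
mean_r2 <- function(split) {
  before <- lm(log(y) ~ t, data = d[d$t <= split, ])
  after  <- lm(log(y) ~ t, data = d[d$t >= split, ])
  mean(c(summary(before)$r.squared, summary(after)$r.squared))
}

# Candidate splits in a window around the peak (hypothetical range)
candidates <- 20:40
scores <- vapply(candidates, mean_r2, numeric(1))

# Table of candidate splits and scores, analogous to the df item
df <- data.frame(split = candidates, mean_r2 = scores)
best_split <- df$split[which.max(df$mean_r2)]
print(best_split)
```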

Examples

# NOT RUN {
if (require(outbreaks)) {
  dat <- ebola_sim$linelist$date_of_onset

  ## EXAMPLE WITH A SINGLE MODEL

  ## compute weekly incidence
  i.7 <- incidence(dat, interval = 7)
  plot(i.7)
  plot(i.7[1:20])

  ## fit a model on the first 20 weeks
  f <- fit(i.7[1:20])
  f
  names(f)
  head(f$info$pred)  # predictions are stored in the $info item

  ## plot model alone (not recommended)
  plot(f)

  ## plot data and model (recommended)
  plot(i.7, fit = f)
  plot(i.7[1:25], fit = f)

  ## EXAMPLE WITH 2 PHASES

  ## specifying the peak manually
  f2 <- fit(i.7, split = as.Date("2014-10-15"))
  f2
  plot(i.7, fit = f2)

  ## finding the best 'peak' date
  f3 <- fit_optim_split(i.7)
  f3
  plot(i.7, fit = f3$fit)
}

# }
