EpiNow2: Estimate real-time case counts and time-varying epidemiological parameters

This package estimates the time-varying reproduction number, rate of spread, and doubling time using a range of open-source tools (Abbott et al.), and current best practices (Gostic et al.). It aims to help users avoid some of the limitations of naive implementations in a framework that is informed by community feedback and is under active development.

It estimates the time-varying reproduction number on cases by date of infection (using a similar approach to that implemented in {EpiEstim}). Imputed infections are then mapped to observed data (for example cases by date of report) via a series of uncertain delay distributions (in the examples in the package documentation these are an incubation period and a reporting delay) and a reporting model that can include weekly periodicity.
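The mapping from infections to reports described above can be illustrated as a discrete convolution. This is a minimal base R sketch, not the package's Stan implementation: the helper names (`discretise_lognormal`, `convolve_delay`), the delay parameters and the day-of-week multipliers are all hypothetical, and the delays are treated as fixed rather than uncertain.

```r
# Hypothetical helper: turn a lognormal delay into a daily probability mass function.
discretise_lognormal <- function(meanlog, sdlog, max_delay) {
  days <- 0:max_delay
  # Probability that the delay falls in [d, d + 1), renormalised after truncation
  pmf <- plnorm(days + 1, meanlog, sdlog) - plnorm(days, meanlog, sdlog)
  pmf / sum(pmf)
}

# Hypothetical helper: convolve a daily count series with a delay PMF.
convolve_delay <- function(counts, pmf) {
  n <- length(counts)
  expected <- numeric(n)
  for (t in seq_len(n)) {
    # Events on day t arise from counts d days earlier, for d = 0, 1, ...
    d <- 0:min(t - 1, length(pmf) - 1)
    expected[t] <- sum(counts[t - d] * pmf[d + 1])
  }
  expected
}

# Illustrative inputs (arbitrary values, not literature estimates)
infections <- c(10, 20, 40, 80, 160, 160, 120, 80, 40, 20)
incubation <- discretise_lognormal(meanlog = 1.6, sdlog = 0.4, max_delay = 15)
reporting  <- discretise_lognormal(meanlog = log(3), sdlog = 1, max_delay = 15)

# Infections -> symptom onsets -> reports
onsets  <- convolve_delay(infections, incubation)
reports <- convolve_delay(onsets, reporting)

# Assumed weekly periodicity: multipliers that average to 1 over a week
day_effect <- rep(c(1.1, 1.1, 1.0, 1.0, 1.0, 0.9, 0.9), length.out = length(reports))
reported_cases_expected <- reports * day_effect
```

Because both delays are truncated to the observed window, the expected reports sum to less than the infections: some infections have not yet been reported by the final day.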

Uncertainty is propagated from all inputs into the final parameter estimates, helping to mitigate spurious findings. This is handled internally. The time-varying reproduction estimates and the uncertain generation time also give time-varying estimates of the rate of growth.
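To make the link between the reproduction number, the generation time and the rate of growth concrete, here is a minimal base R sketch. It assumes a gamma-distributed generation time, for which the relationship has a closed form. The function names ending in `_sketch` and the numeric inputs are hypothetical; the package provides its own R_to_growth and growth_to_R functions, whose implementation may differ.

```r
# Growth rate implied by a reproduction number, assuming a gamma generation time
# with the given mean and standard deviation (in days).
r_to_growth_sketch <- function(R, gt_mean, gt_sd) {
  k <- (gt_sd / gt_mean)^2          # squared coefficient of variation
  (R^k - 1) / (k * gt_mean)
}

# Inverse relationship: reproduction number implied by a growth rate.
growth_to_r_sketch <- function(r, gt_mean, gt_sd) {
  k <- (gt_sd / gt_mean)^2
  (1 + k * r * gt_mean)^(1 / k)
}

# Doubling time in days (negative values are halving times).
doubling_time_sketch <- function(r) log(2) / r

# Illustrative values only
growth <- r_to_growth_sketch(R = 1.5, gt_mean = 3.6, gt_sd = 3.1)
doubling_time_sketch(growth)
```

A reproduction number of 1 gives a growth rate of 0 under this relationship, and values below 1 give negative growth rates and therefore negative (halving) times.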

The default model uses a non-stationary Gaussian process to estimate the time-varying reproduction number and then infer infections. Other options include:

  • A stationary Gaussian process (faster to estimate but currently gives reduced performance for real-time estimates).
  • User specified breakpoints.
  • A fixed reproduction number.
  • A piecewise constant reproduction number (by combining a fixed reproduction number with breakpoints).
  • A random walk (by combining a fixed reproduction number with regularly spaced breakpoints, for example weekly).
  • Inferring infections using back-calculation and then calculating the time-varying reproduction number.
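The breakpoint-based options in the list above can be pictured with a short base R sketch. It builds a weekly breakpoint indicator and a piecewise constant reproduction number that follows a random walk. The log-scale random walk, the step size and the starting value are assumptions made for illustration, not the package's model definition.

```r
set.seed(1)
n_days <- 28

# Breakpoint indicator: 1 on the first day of each new week (days 8, 15, 22), else 0
day <- seq_len(n_days)
breakpoint <- as.integer(day %% 7 == 1 & day > 1)

# Segment index for each day: increases by one at every breakpoint
segment <- cumsum(breakpoint) + 1

# Fixed initial reproduction number plus a random-walk step at each breakpoint
log_r0 <- log(2)
steps <- rnorm(max(segment) - 1, mean = 0, sd = 0.1)
log_r_segment <- log_r0 + c(0, cumsum(steps))

# Piecewise constant Rt: one value per week
rt <- exp(log_r_segment[segment])
```

With no breakpoints, `segment` is 1 throughout and `rt` reduces to the fixed reproduction number. With user-chosen breakpoints, the same construction gives a piecewise constant trajectory.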

The documentation for estimate_infections provides examples of the different options available.

Forecasting is also supported for the time-varying reproduction number, infections and reported cases. The time-varying reproduction number can be forecast forwards in time using an integration with the {EpiSoon} package, and converted to a case forecast using the renewal equation. Alternatively, the time-varying reproduction number and cases can be forecast using a Gaussian process.
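As a rough illustration of converting a reproduction number forecast into an infection forecast with the renewal equation, the base R sketch below computes each new day's infections as Rt multiplied by past infections weighted by the generation time distribution. The function name `renewal_forecast`, the generation time probabilities and the Rt trajectory are hypothetical, and the sketch is deterministic: it ignores the uncertainty that the package propagates.

```r
# Hypothetical helper: project infections forward using the renewal equation.
# gt_pmf[s] is the probability that the generation interval is s days (s = 1, 2, ...).
renewal_forecast <- function(past_infections, rt_future, gt_pmf) {
  infections <- past_infections
  for (r in rt_future) {
    n <- length(infections)
    s <- seq_len(min(n, length(gt_pmf)))
    # Infectiousness: recent infections weighted by the generation time PMF
    infectiousness <- sum(infections[n - s + 1] * gt_pmf[s])
    infections <- c(infections, r * infectiousness)
  }
  # Return only the forecast days
  tail(infections, length(rt_future))
}

# Illustrative inputs (arbitrary values)
past_infections <- c(80, 90, 100, 110, 120, 130, 140)
gt_pmf <- c(0.2, 0.4, 0.25, 0.15)   # sums to 1
rt_future <- rep(0.9, 7)            # assumed constant forecast of Rt

renewal_forecast(past_infections, rt_future, gt_pmf)
```

Forecast infections could then be mapped to reported cases by convolving them with the delay distributions.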

A simple example of using the package to estimate a national Rt for Covid-19 can be found here.

Installation

Install the stable version of the package:

install.packages("EpiNow2")

Install the stable development version of the package using {drat}:

install.packages("drat")
drat::addRepo("epiforecasts")
install.packages("EpiNow2")

Install the unstable development version of the package with:

remotes::install_github("epiforecasts/EpiNow2")

Windows users will need a working installation of Rtools in order to build the package from source. See here for a guide to installing Rtools for use with Stan (which is the statistical modelling platform used for the underlying model). For simple deployment/development a prebuilt docker image is also available (see documentation here).

Quick start

{EpiNow2} is designed to be used either through a single function call or in an ad-hoc fashion via individual function calls. The core functions of {EpiNow2} are the two single-call functions epinow and regional_epinow, plus estimate_infections and forecast_infections. The following sections give an overview of the simple use case for epinow and regional_epinow. estimate_infections can be used on its own to infer the underlying infection curve from reported cases and to estimate Rt. Estimating the underlying infection curve via back-calculation (and then calculating Rt) is substantially less computationally demanding than fitting the default model but may result in less reliable estimates of Rt. For more details on using each function see the function documentation.

The first step to using the package is to load it as follows.

library(EpiNow2)

Reporting delays, incubation period and generation time

Distributions can either be fitted using package functionality or determined elsewhere and then defined with uncertainty for use in {EpiNow2}. When data are supplied, a subsampled bootstrapped lognormal is fitted (to account for uncertainty in the observed data without being biased by changes in incidence). An arbitrary number of delay distributions is supported, with the most common use case likely to be an incubation period followed by a reporting delay.

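# Fit a reporting delay to 1000 simulated lognormal delays (a single bootstrap keeps the example fast)
reporting_delay <- estimate_delay(rlnorm(1000, log(3), 1),
                                  max_value = 15, bootstraps = 1)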
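The idea behind a subsampled bootstrap fit can be sketched in base R as follows. This is a simplification made under stated assumptions: it fits a continuous lognormal by closed-form maximum likelihood on each subsample, whereas the package's function index describes an integer-adjusted fit to integer values. The function name `bootstrap_lognormal`, its arguments and the output element names are chosen for illustration only.

```r
set.seed(123)
delays <- rlnorm(1000, log(3), 1)

# Hypothetical helper: fit a lognormal to repeated subsamples and summarise
# the fitted parameters together with their uncertainty.
bootstrap_lognormal <- function(values, bootstraps = 100, bootstrap_samples = 250) {
  fits <- replicate(bootstraps, {
    # Subsample with replacement so that each fit sees a different slice of the data
    sub <- sample(values, min(length(values), bootstrap_samples), replace = TRUE)
    # Closed-form lognormal estimates on the log scale
    c(meanlog = mean(log(sub)), sdlog = sd(log(sub)))
  })
  list(
    mean    = mean(fits["meanlog", ]),  # central estimate of the log mean
    mean_sd = sd(fits["meanlog", ]),    # uncertainty in the log mean
    sd      = mean(fits["sdlog", ]),    # central estimate of the log sd
    sd_sd   = sd(fits["sdlog", ])       # uncertainty in the log sd
  )
}

fit <- bootstrap_lognormal(delays)
str(fit)
```

Using subsamples smaller than the full data set widens the spread of the fitted parameters, which is one way to carry uncertainty about the delay forward into later estimates.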

Here we define the incubation period and generation time based on literature estimates for Covid-19 (see here for the code that generates these estimates).

generation_time <- get_generation_time(disease = "SARS-CoV-2", source = "ganyani")
incubation_period <- get_incubation_period(disease = "SARS-CoV-2", source = "lauer")

epinow

This function represents the core functionality of the package and includes results reporting, plotting and optional saving. It requires a data frame of cases by date of report and the distributions defined above. An additional forecasting module is supported via EpiSoon and companion packages (see documentation for an example).

Load example case data from {EpiNow2}.

reported_cases <- example_confirmed[1:90]
head(reported_cases)
#>          date confirm
#> 1: 2020-02-22      14
#> 2: 2020-02-23      62
#> 3: 2020-02-24      53
#> 4: 2020-02-25      97
#> 5: 2020-02-26      93
#> 6: 2020-02-27      78

Estimate cases by date of infection, the time-varying reproduction number, the rate of growth and forecast these estimates into the future by 7 days. Summarise the posterior and return a summary table and plots for reporting purposes. If a target_folder is supplied results can be internally saved (with the option to also turn off explicit returning of results). Note: For real use cases more samples and a longer warm up may be needed. See fitting progress by setting verbose = TRUE.

estimates <- epinow(reported_cases = reported_cases, 
                    generation_time = generation_time,
                    delays = delay_opts(incubation_period, reporting_delay),
                    rt = rt_opts(prior = list(mean = 2, sd = 0.2)),
                    stan = stan_opts(cores = 4))
names(estimates)
#> [1] "estimates"                "estimated_reported_cases"
#> [3] "summary"                  "plots"                   
#> [5] "timing"

Both summary measures and posterior samples are returned for all parameters in an easily explored format which can be accessed using summary. The default is to return a summary table of estimates for key parameters at the latest date partially supported by data.

knitr::kable(summary(estimates))
|measure                               |estimate            |
|:-------------------------------------|:-------------------|
|New confirmed cases by infection date |515 (289 – 1038)    |
|Expected change in daily cases        |Unsure              |
|Effective reproduction no.            |0.9 (0.7 – 1.2)     |
|Rate of growth                        |-0.04 (-0.1 – 0.05) |
|Doubling/halving time (days)          |-18.2 (13.5 – -7)   |

Summarised parameter estimates can also easily be returned, either filtered for a single parameter or for all parameters.

head(summary(estimates, type = "parameters", params = "R"))
#>          date variable strat     type   median     mean         sd lower_90
#> 1: 2020-02-22        R  <NA> estimate 2.111871 2.119982 0.13864546 1.899283
#> 2: 2020-02-23        R  <NA> estimate 2.086460 2.094418 0.11619947 1.908121
#> 3: 2020-02-24        R  <NA> estimate 2.061035 2.067174 0.09690610 1.912490
#> 4: 2020-02-25        R  <NA> estimate 2.032317 2.038242 0.08093213 1.904885
#> 5: 2020-02-26        R  <NA> estimate 2.004828 2.007648 0.06841627 1.897507
#> 6: 2020-02-27        R  <NA> estimate 1.974385 1.975444 0.05932968 1.880276
#>    lower_50 lower_20 upper_20 upper_50 upper_90
#> 1: 2.021864 2.077693 2.143147 2.210176 2.361724
#> 2: 2.013526 2.057374 2.115925 2.172291 2.296242
#> 3: 2.000372 2.035607 2.086856 2.133616 2.233151
#> 4: 1.983199 2.013435 2.056292 2.093377 2.174351
#> 5: 1.961205 1.989248 2.022717 2.052258 2.123259
#> 6: 1.936145 1.959619 1.989932 2.012621 2.076784
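The interval columns in the output above (lower_90, lower_50, lower_20 and their upper counterparts) can be understood as nested credible intervals around the median. The base R sketch below shows one way such columns could be computed from posterior samples, assuming central (equal-tailed) intervals taken from quantiles. The function name `summarise_samples` and the simulated samples are hypothetical, and the package's own calculation may differ.

```r
# Hypothetical helper: summarise posterior samples with nested credible intervals.
summarise_samples <- function(samples, CrIs = c(0.2, 0.5, 0.9)) {
  out <- c(median = median(samples), mean = mean(samples), sd = sd(samples))
  for (cri in CrIs) {
    # Equal-tailed interval: leave (1 - cri) / 2 probability in each tail
    tail_prob <- (1 - cri) / 2
    bounds <- quantile(samples, c(tail_prob, 1 - tail_prob), names = FALSE)
    label <- round(100 * cri)
    out[paste0("lower_", label)] <- bounds[1]
    out[paste0("upper_", label)] <- bounds[2]
  }
  out
}

# Simulated stand-in for posterior samples of R on a single date
set.seed(42)
r_samples <- rnorm(2000, mean = 2.1, sd = 0.14)
round(summarise_samples(r_samples), 2)
```

The 20% interval sits inside the 50% interval, which in turn sits inside the 90% interval, matching the ordering of the columns in the summary table.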

Reported cases are returned in a separate data frame in order to streamline the reporting of forecasts and for model evaluation.

head(summary(estimates, output = "estimated_reported_cases"))
#>          date  type median    mean       sd lower_90 lower_50 lower_20 upper_20
#> 1: 2020-02-22 gp_rt   48.0  49.051 12.75936    30.00       40       45       51
#> 2: 2020-02-23 gp_rt   65.0  66.375 16.15052    41.00       55       61       70
#> 3: 2020-02-24 gp_rt   72.5  73.982 17.47768    46.95       62       69       77
#> 4: 2020-02-25 gp_rt   75.0  76.623 18.41798    48.95       64       70       80
#> 5: 2020-02-26 gp_rt   96.0  98.136 22.77899    63.95       82       91      101
#> 6: 2020-02-27 gp_rt  129.0 130.737 29.66405    84.00      109      123      136
#>    upper_50 upper_90
#> 1:       57       72
#> 2:       77       95
#> 3:       85      104
#> 4:       88      109
#> 5:      112      138
#> 6:      149      184

A range of plots are returned (with the single summary plot shown below). These plots can also be generated using the following plot method.

plot(estimates)

regional_epinow

The regional_epinow function runs the epinow function across multiple regions in an efficient manner.

Define cases in multiple regions delineated by the region variable.

reported_cases <- data.table::rbindlist(list(
   data.table::copy(reported_cases)[, region := "testland"],
   reported_cases[, region := "realland"]))
head(reported_cases)
#>          date confirm   region
#> 1: 2020-02-22      14 testland
#> 2: 2020-02-23      62 testland
#> 3: 2020-02-24      53 testland
#> 4: 2020-02-25      97 testland
#> 5: 2020-02-26      93 testland
#> 6: 2020-02-27      78 testland

Calling regional_epinow runs epinow on each region in turn (or in parallel, depending on the settings used).

estimates <- regional_epinow(reported_cases = reported_cases, 
                             generation_time = generation_time,
                             delays = delay_opts(incubation_period, reporting_delay),
                             rt = rt_opts(prior = list(mean = 2, sd = 0.2)),
                             stan = stan_opts(cores = 4))
#> INFO [2020-11-12 15:49:50] Producing following optional outputs: regions, summary, samples, plots, latest
#> INFO [2020-11-12 15:49:50] Reporting estimates using data up to: 2020-05-21
#> INFO [2020-11-12 15:49:50] No target directory specified so returning output
#> INFO [2020-11-12 15:49:50] Producing estimates for: testland, realland
#> INFO [2020-11-12 15:49:50] Regions excluded: none
#> INFO [2020-11-12 15:49:50] Showing progress using progressr. Modify this behaviour using progressr::handlers.
#> INFO [2020-11-12 16:00:20] Completed estimates for: testland
#> INFO [2020-11-12 16:10:54] Completed estimates for: realland
#> INFO [2020-11-12 16:10:54] Completed regional estimates
#> INFO [2020-11-12 16:10:54] Regions with estimates: 2
#> INFO [2020-11-12 16:10:54] Regions with runtime errors: 0
#> INFO [2020-11-12 16:10:54] Producing summary
#> INFO [2020-11-12 16:10:54] No summary directory specified so returning summary output
#> INFO [2020-11-12 16:10:55] No target directory specified so returning timings

Results from each region are stored in a regional list, with across-region summary measures and plots stored in a summary list. All results can be saved internally by setting the target_folder and summary_dir arguments. Regions can be estimated in parallel using the {future} package (in which case cores should, in most scenarios, be set to 1). For routine use each MCMC chain can also be run in parallel (with future = TRUE), with a timeout (max_execution_time) that allows partial results to be returned if a subset of chains runs longer than expected. See the documentation for the {future} package for details on nested futures.

Summary measures that are returned include a table formatted for reporting (along with raw results for further processing). Future updates will extend the S3 methods used above to smooth access to this output.

knitr::kable(estimates$summary$summarised_results$table)
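|Region   |New confirmed cases by infection date |Expected change in daily cases |Effective reproduction no. |Rate of growth      |Doubling/halving time (days) |
|:--------|:-------------------------------------|:------------------------------|:--------------------------|:-------------------|:----------------------------|
|realland |520 (285 – 924)                       |Unsure                         |0.9 (0.6 – 1.2)            |-0.03 (-0.1 – 0.04) |-20 (17.1 – -6.8)            |
|testland |513 (297 – 996)                       |Unsure                         |0.9 (0.7 – 1.2)            |-0.04 (-0.1 – 0.05) |-18.5 (14.1 – -6.9)          |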

A range of plots are again returned (with the single summary plot shown below).

estimates$summary$summary_plot

Reporting templates

Rmarkdown templates are provided in the package (templates) for semi-automated reporting of estimates. These are currently undocumented but an example integration can be seen here. If you use these templates to report your results, please highlight our limitations, as these are key to understanding the results from {EpiNow2}.

Interactive figures

{EpiNow2} is integrated with the {RtD3} package which provides interactive visualisations of Rt estimates. See the package documentation for details.

Contributing

File an issue here if you have identified an issue with the package. Please note that due to operational constraints priority will be given to users informing government policy or offering methodological insights. We welcome all contributions, in particular those that improve the approach or the robustness of the code base.

Version: 1.3.2
License: MIT + file LICENSE
Maintainer: Sam Abbott
Last published: December 14th, 2020

Functions in EpiNow2 (1.3.2)

  • R_to_growth: Convert Reproduction Numbers to Growth Rates
  • allocate_empty: Allocate Empty Parameters to a List
  • allocate_delays: Allocate Delays into Required Stan Format
  • calc_summary_measures: Calculate All Summary Measures
  • adjust_infection_to_report: Adjust from Case Counts by Infection Date to Date of Report
  • backcalc_opts: Back Calculation Options
  • calc_CrIs: Calculate Credible Intervals
  • bootstrapped_dist_fit: Fit a Subsampled Bootstrap to Integer Values and Summarise Distribution Parameters
  • EpiNow2-package: EpiNow2: Estimate Real-Time Case Counts and Time-Varying Epidemiological Parameters
  • calc_CrI: Calculate Credible Interval
  • convert_to_logmean: Convert mean and sd to log mean for a log normal distribution
  • clean_nowcasts: Clean Nowcasts for a Supplied Date
  • create_stan_data: Create Stan Data Required for estimate_infections
  • calc_summary_stats: Calculate Summary Statistics
  • create_stan_args: Create a List of Stan Arguments
  • create_initial_conditions: Create Initial Conditions Generating Function
  • create_obs_model: Create Observation Model Settings
  • copy_results_to_latest: Copy Results From Dated Folder to Latest
  • dist_fit: Fit an Integer Adjusted Exponential, Gamma or Lognormal distributions
  • country_map: Generate a country map for a single variable.
  • delay_opts: Delay Distribution Options
  • create_backcalc_data: Create Back Calculation Data
  • dist_skel: Distribution Skeleton
  • epinow: Real-time Rt Estimation, Forecasting and Reporting
  • clean_regions: Clean Regions
  • construct_output: Construct Output
  • extract_static_parameter: Extract Samples from a Parameter with a Single Dimension
  • create_rt_data: Create Time-varying Reproduction Number Data
  • estimate_delay: Estimate a Delay Distribution
  • create_shifted_cases: Create Delay Shifted Cases
  • create_clean_reported_cases: Create Clean Reported Cases
  • estimate_infections: Estimate Infections, the Time-Varying Reproduction Number and the Rate of Growth
  • example_confirmed: Example Confirmed Case Data Set
  • estimates_by_report_date: Estimate Cases by Report Date
  • fit_model_with_vb: Fit a Stan Model using Variational Inference
  • fit_model_with_nuts: Fit a Stan Model using the NUTs sampler
  • get_generation_time: Get a Literature Distribution for the Generation Time
  • get_raw_result: Get a Single Raw Result
  • forecast_infections: Forecast Infections and the Time-Varying Reproduction Number
  • get_incubation_period: Get a Literature Distribution for the Incubation Period
  • create_gp_data: Create Gaussian Process Data
  • create_future_rt: Construct the Required Future Rt assumption
  • gamma_dist_def: Generate a Gamma Distribution Definition Based on Parameter Estimates
  • forecast_secondary: Forecast Secondary Observations Given a Fit from estimate_secondary
  • plot.epinow: Plot method for epinow
  • opts_list: Return an _opts List per Region
  • format_fit: Format Posterior Samples
  • convert_to_logsd: Convert mean and sd to log standard deviation for a log normal distribution
  • estimate_secondary: Estimate a Secondary Observation from a Primary Observation
  • regional_summary: Regional Summary Output
  • extract_CrIs: Extract Credible Intervals Present
  • expose_stan_fns: Expose internal package stan functions in R
  • report_cases: Report case counts by date of report
  • make_conf: Format Credible Intervals
  • regional_epinow: Real-time Rt Estimation, Forecasting and Reporting by Region
  • get_regional_results: Get Combined Regional Results
  • map_prob_change: Categorise the Probability of Change for Rt
  • summary.epinow: Summary output from epinow
  • growth_to_R: Convert Growth Rates to Reproduction numbers.
  • incubation_periods: Literature Estimates of Incubation Periods
  • regional_runtimes: Summarise Regional Runtimes
  • setup_default_logging: Setup Default Logging
  • filter_opts: Filter Options for a Target Region
  • setup_target_folder: Setup Target Folder for Saving
  • setup_dt: Convert to Data Table
  • lognorm_dist_def: Generate a Log Normal Distribution Definition Based on Parameter Estimates
  • get_dist: Get a Literature Distribution
  • init_cumulative_fit: Generate initial conditions by fitting to cumulative cases
  • generation_times: Literature Estimates of Generation Times
  • extract_parameter_samples: Extract Parameter Samples from a Stan Model
  • summary.estimate_infections: Summary output from estimate_infections
  • process_region: Process regional estimate
  • plot.estimate_truncation: Plot method for estimate_truncation
  • extract_stan_param: Extract a Parameter Summary from a Stan Object
  • obs_opts: Observation Model Options
  • get_regions: Get Folders with Results
  • match_output_arguments: Match User Supplied Arguments with Supported Options
  • extract_inits: Generate initial conditions from a Stan fit
  • estimate_truncation: Estimate Truncation of Observed Data
  • get_regions_with_most_reports: Get Regions with Most Reported Cases
  • extract_parameter: Extract Samples for a Parameter from a Stan model
  • plot_estimates: Plot Estimates
  • plot_summary: Plot a Summary of the Latest Results
  • process_regions: Process all Region Estimates
  • rstan_sampling_opts: Rstan Sampling Options
  • save_input: Save Observed Data
  • save_estimate_infections: Save Estimated Infections
  • rstan_opts: Rstan Options
  • global_map: Generate a global map for a single variable.
  • rstan_vb_opts: Rstan Variational Bayes Options
  • plot.estimate_secondary: Plot method for estimate_secondary
  • gp_opts: Approximate Gaussian Process Settings
  • plot_CrIs: Plot EpiNow2 Credible Intervals
  • plot.estimate_infections: Plot method for estimate_infections
  • simulate_cases: Simulate Cases by Date of Infection, Onset and Report
  • report_plots: Report plots
  • trunc_opts: Truncation Distribution Options
  • theme_map: Custom Map Theme
  • secondary_opts: Secondary Reports Options
  • report_summary: Provide Summary Statistics for Estimated Infections and Rt
  • rt_opts: Time-Varying Reproduction Number Options
  • run_region: Run epinow with Regional Processing Code
  • update_list: Update a List
  • simulate_infections: Simulate infections using a given trajectory of the time-varying reproduction number
  • save_forecast_infections: Save Forecast Infections
  • setup_future: Set up Future Backend
  • setup_logging: Setup Logging
  • stan_opts: Stan Options
  • tune_inv_gamma: Tune an Inverse Gamma to Achieve the Target Truncation
  • summarise_key_measures: Summarise rt and cases
  • summarise_results: Summarise Real-time Results
  • sample_approx_dist: Approximate Sampling a Distribution using Counts
  • update_horizon: Updates Forecast Horizon Based on Input Data and Target