An early analysis of the COVID-19 pandemic

Dataset

This dataset was collected from public agencies or news media and contains detailed information about some 1400 COVID-19 cases confirmed in and outside China. It is free to use and share under the CC-BY-4.0 license, provided that appropriate credit is given. It can be loaded in R as a package:

# install.packages("devtools")  # uncomment if devtools is not yet installed
devtools::install_github("qingyuanzhao/bets.covid19")
library(bets.covid19)
head(covid19_data)

More details about the dataset can be found in

help(covid19_data)

and in the arXiv preprint (see Reference).

Statistical inference: the BETS model

We have developed a generative model for four key epidemiological events: Beginning of exposure, End of exposure, time of Transmission, and time of Symptom onset (BETS). This package implements likelihood inference for the BETS model. Try:

help(bets.inference)
example(bets.inference)
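
To make the four events concrete, the following is a minimal base-R sketch that simulates B, E, T and S for hypothetical cases. It is not the package's own simulator (see simulate.case for that). The uniform exposure windows, the growth rate r, the exponential-growth transmission density, and the gamma incubation parameters are all illustrative assumptions.

```r
# Illustrative simulation of the four BETS events in base R.
# All distributions and parameter values below are assumptions for illustration,
# not the package's implementation or estimates from the paper.
set.seed(2020)
n <- 1000

r <- log(2) / 2.5                      # assumed growth rate (doubling time of 2.5 days)
b <- runif(n, min = 0, max = 30)       # B: beginning of exposure (day index)
e <- b + runif(n, min = 1, max = 20)   # E: end of exposure, always after B

# T: transmission time in [B, E], with density proportional to exp(r * t),
# drawn by inverting the CDF of that truncated exponential-growth density.
u <- runif(n)
tr <- log(exp(r * b) + u * (exp(r * e) - exp(r * b))) / r

# S: symptom onset = transmission time + incubation period (assumed gamma, mean 5 days)
s <- tr + rgamma(n, shape = 4, rate = 0.8)

cases <- data.frame(B = b, E = e, T = tr, S = s)
head(cases)
```

Each simulated row satisfies B <= T <= E and T < S, which is the ordering of events the model describes.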

Details of the model and methodology can be found in the arXiv preprint (see Reference). In short, we find that several published early analyses were severely biased by sample selection. All our analyses, regardless of the subsample and model used, point to an epidemic doubling time of 2 to 2.5 days during the early outbreak in Wuhan.
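
For intuition, a doubling time converts to an exponential growth rate via r = log(2) / doubling time. The short calculation below, which assumes pure exponential growth, shows what a doubling time of 2 to 2.5 days implies per day and per week.

```r
# Convert the reported doubling times into exponential growth rates,
# assuming pure exponential growth: cases(t) proportional to exp(r * t).
doubling_time <- c(2, 2.5)               # days
growth_rate <- log(2) / doubling_time    # per day
round(growth_rate, 3)                    # 0.347 and 0.277

# Implied multiplication of case counts over one week
weekly_factor <- 2^(7 / doubling_time)
round(weekly_factor, 1)                  # 11.3 and 7.0
```

Under this assumption, case counts would grow roughly 7- to 11-fold in a single week.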

A Bayesian nonparametric analysis further suggests that 5% of symptomatic cases may not develop symptoms within 14 days of infection. Code for the Bayesian model and MCMC sampler can be found in the bayesian folder.

Reference

  • First report: Qingyuan Zhao, Yang Chen, Dylan S Small. Analysis of the epidemic growth of the early 2019-nCoV outbreak using internationally confirmed cases. medRxiv 2020.02.06.20020941; doi: https://doi.org/10.1101/2020.02.06.20020941
  • Full model: Qingyuan Zhao, Nianqiao Ju, Sergio Bacallado, Rajen Shah. BETS: The dangers of selection bias in early analyses of the coronavirus disease (COVID-19) pandemic. arXiv:2004.07743.

Acknowledgement

Many people have contributed to the data collection and given helpful suggestions. We thank Rajen Shah, Yachong Yang, Cindy Chen, Yang Chen, Dylan Small, Michael Levy, Hera He, Zilu Zhou, Yunjin Choi, James Robins, Marc Lipsitch, and Andrew Rosenfeld.

Earlier work

This project started from a preliminary analysis of some international COVID-19 cases exported from Wuhan. The report of that first analysis can be found on medRxiv (see Reference). Code for that analysis can be found in the report1 branch.

Install

install.packages('bets.covid19')

Monthly Downloads: 189
Version: 1.0.0
License: CC BY 4.0
Maintainer: Qingyuan Zhao
Last Published: May 12th, 2020

Functions in bets.covid19 (1.0.0)

  • bets.inference: Likelihood inference
  • wuhan_exported: COVID-19 exported from Wuhan
  • age.process: Processing age to print its distribution
  • preprocess.data: Prepare data frame for analysis
  • bets.likelihood.conditional: (Profile) Conditional likelihood given B and E
  • bets.covid19: A package for analyzing early epidemic data
  • parse.infected: Parse the infected date
  • simulate.case: Simulate case information from the generative BETS model
  • parse.one.infected: Parse infected date (basic)
  • bets.likelihood.unconditional: Approximate profile likelihood
  • covid19_data: Confirmed cases of COVID-19
  • date.process: Transform date to numeric
  • bets.likelihood: (Profile) Likelihood function