Coronavirus COVID-19 (2019-nCoV) Epidemic Datasets
This repository aims to unify COVID-19 datasets from different sources in order to simplify data acquisition and the subsequent analysis. You are welcome to join and contribute by extending the number of supported data sources as a joint effort against COVID-19.
The data are available to the end-user via the R package COVID19 or in csv format (see below or on Kaggle).
About
Goal
Provide the research community with a unified data hub by collecting worldwide fine-grained data merged with demographics, air pollution, and other exogenous variables helpful for a better understanding of COVID-19.
How
The data are collected with the R package COVID19. For R users, the COVID19 package is the recommended way to interact with the dataset. For non-R users, the data are provided in csv format and regularly updated (see below or on Kaggle).
Join the mission
Whether or not you are an R user, you can take part in the data collection! Your contribution will be gratefully acknowledged.
R users
Find real-time data sources and write R function(s) to import the data.
- Find data sources for real-time data such as the number of cases, deaths, tests, and hospitalizations, or other variables of this kind. See the data coverage table below to avoid working on something that is already available.
- Write an R function to import the data, just like this.
- Submit your function to this repository by creating a pull request.
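As a rough illustration of the second step, an import function might look like the sketch below. This is not the repository's reference implementation: the function name `import_example`, the raw column names (`Date`, `Cases`, `Deaths`, `Tests`) and the output layout are assumptions made for the example, and a real source would usually be a URL rather than a temporary file.

```r
# Hypothetical importer: reads a raw csv (local path or URL) and returns
# a tidy data frame. Raw column names are assumed for illustration only.
import_example <- function(src) {
  raw <- read.csv(src, stringsAsFactors = FALSE)
  data.frame(
    date      = as.Date(raw$Date),       # ISO dates, e.g. "2020-03-01"
    confirmed = as.integer(raw$Cases),
    deaths    = as.integer(raw$Deaths),
    tests     = as.integer(raw$Tests)
  )
}

# Demo on a small temporary file standing in for a real data source
tmp <- tempfile(fileext = ".csv")
writeLines(c("Date,Cases,Deaths,Tests",
             "2020-03-01,10,0,100",
             "2020-03-02,25,1,240"), tmp)
x <- import_example(tmp)
print(x)
```

Returning plain, consistently named columns makes it easier to merge a new source with the variables already covered.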
Non-R users
Find historical data sources and put them into csv files.
- Find data sources for historical data such as demographics, population density, age, and air quality, or other variables of this kind. See the data coverage table below to avoid working on something that is already available.
- Create or improve a csv file, just like this.
- Submit your csv file to this repository by creating a pull request.
R Package COVID19
A simple yet effective R package for acquiring tidy-format datasets of the 2019 Novel Coronavirus COVID-19 (2019-nCoV) epidemic. The data are downloaded in real time, cleaned, and matched with exogenous variables.
Quickstart
# Install COVID19
install.packages("COVID19")
# Load COVID19
library("COVID19")
Data Acquisition
# Diamond Princess
d1 <- diamond()
# World
w1 <- world("country") # data by country
w2 <- world("state") # data by state
# US
u1 <- us("country") # data by country
u2 <- us("state") # data by state
# Italy
i1 <- italy("country") # data by country
i2 <- italy("state") # data by region
i3 <- italy("city") # data by city
# Switzerland
s1 <- switzerland("country") # data by country
s2 <- switzerland("state") # data by canton
# Liechtenstein
l1 <- liechtenstein() # data by country
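Because the data are downloaded in real time, a call can fail when a source is unreachable. The sketch below shows one way to guard against that. The helper `safe_fetch` is hypothetical and not part of the COVID19 package, and the two stub functions only stand in for real calls such as `world("country")` so that the example runs offline.

```r
# Hypothetical helper (not part of COVID19): run a data acquisition
# function and return NULL instead of stopping if it raises an error.
safe_fetch <- function(fetch, ...) {
  tryCatch(
    fetch(...),
    error = function(e) {
      message("download failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Stubs standing in for package functions, so the sketch runs offline
ok_source   <- function(level) data.frame(level = level, confirmed = 1)
down_source <- function(level) stop("source unreachable")

a <- safe_fetch(ok_source, "country")    # returns a data frame
b <- safe_fetch(down_source, "country")  # prints a message, returns NULL
```

With the package loaded, the same pattern would be `safe_fetch(world, "country")`.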
Data Hub (csv)
Daily updated datasets of the 2019 Novel Coronavirus COVID-19 (2019-nCoV) epidemic in csv format. The following table shows the data coverage for each variable in each file.
File | deaths | confirmed | tests | pop | pop_14 | pop_15_64 | pop_65 | pop_age | pop_density | pop_death_rate |
---|---|---|---|---|---|---|---|---|---|---|
 | number of COVID19 deaths | number of COVID19 confirmed cases | number of COVID19 tests | total population | population ages 0-14 (% of total population)* | population ages 15-64 (% of total population)** | population ages 65+ (% of total population) | median age of population | population density per km² | population mortality rate |
World | ||||||||||
World: country level | ||||||||||
World: state level | ||||||||||
US | ||||||||||
US: country level | ||||||||||
US: state level | ||||||||||
Italy | ||||||||||
Italy: country level | ||||||||||
Italy: state level | ||||||||||
Italy: city level | ||||||||||
Switzerland | ||||||||||
Switzerland: country level | ||||||||||
Switzerland: state level | ||||||||||
Liechtenstein | ||||||||||
Liechtenstein: country level | ||||||||||
Diamond Princess | ||||||||||
Diamond Princess |
* Switzerland: ages 0-19
** Switzerland: ages 20-64
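Since the age-group variables are percentages of the total population, absolute counts and per-capita rates have to be derived from `pop`. The following is a minimal sketch on made-up rows that use the variable names from the table above; the `id` column and all values are assumptions for illustration only.

```r
# Toy data: variable names follow the table above, values are invented.
# The `id` column is hypothetical.
x <- data.frame(
  id        = c("A", "B"),
  confirmed = c(1000, 500),
  deaths    = c(20, 5),
  pop       = c(1e6, 2e5),
  pop_65    = c(20, 15)   # % of total population aged 65+
)

# Absolute number of people aged 65+
x$pop_65_abs <- x$pop * x$pop_65 / 100

# Confirmed cases per 100,000 inhabitants
x$confirmed_per_100k <- x$confirmed / x$pop * 1e5

# Deaths as a share of confirmed cases
x$deaths_per_confirmed <- x$deaths / x$confirmed

print(x)
```

Normalizing by population in this way makes regions of very different size comparable.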
Data Sources
The following sources are gratefully acknowledged for making the data available to the public.
Acknowledgements
The following people have contributed to the data collection as a joint effort against COVID-19.
Use Cases
- Monitoring the advancement of the COVID-19 contagion in the regions of Italy (code)