read_logs: Fetch and load logs from RStudio mirror

Description

Fetches R package download logs from the RStudio mirror into the specified directory, skipping logs that already exist there, and reads them with fread from the data.table package.

Usage

read_logs(start=Sys.Date()-30L, end=Sys.Date(), dir="cran-mirror", verbose=TRUE)

Arguments

start

Fetch download logs from this date (format: YYYY-MM-DD). Default is Sys.Date()-30L.

end

Fetch download logs until this date (format: YYYY-MM-DD). Default is Sys.Date().

dir

Complete path to the directory into which the logs should be downloaded. Default is ./cran-mirror.

verbose

If TRUE (the default), informative messages are printed to the console.

select

Character vector of column names to load from the logs. Default is c("date", "time", "package", "country", "ip_id"). NULL loads all columns.
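As a rough illustration of what select does, the sketch below passes the documented default column names to data.table::fread on a small in-memory CSV. The sample rows and the two extra columns (size, r_version) are made up for the example, and it is an assumption that read_logs forwards select to fread in exactly this way.

```r
library(data.table)

# Tiny stand-in for one log file. The rows are invented; the extra columns
# (size, r_version) only serve to show what `select` drops.
csv_text <- paste(
  "date,time,size,r_version,package,country,ip_id",
  "2015-01-01,00:00:01,1024,3.1.2,data.table,US,1",
  "2015-01-01,00:00:02,2048,3.1.2,ggplot2,DE,2",
  sep = "\n"
)

# The documented default column selection.
cols <- c("date", "time", "package", "country", "ip_id")

# Load only the selected columns.
subset_dt <- fread(text = csv_text, select = cols)

# select = NULL loads every column in the file.
all_dt <- fread(text = csv_text, select = NULL)
```

Loading fewer columns keeps memory use down when many daily logs are combined.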

Value

A data.table of all the download logs that are available between start and end dates.
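Because the result is an ordinary data.table, it can be summarised with the usual data.table syntax. The sketch below counts downloads per package; the logs object is a hand-made stand-in with invented rows rather than real output of read_logs, and it assumes the default columns are present.

```r
library(data.table)

# Stand-in for the value returned by read_logs(); the rows are made up.
logs <- data.table(
  date    = as.Date(c("2015-01-01", "2015-01-01", "2015-01-02")),
  time    = c("00:00:01", "00:00:02", "00:00:03"),
  package = c("data.table", "ggplot2", "data.table"),
  country = c("US", "DE", "IN"),
  ip_id   = c(1L, 2L, 1L)
)

# Number of downloads per package, most downloaded first.
per_package <- logs[, .N, by = package][order(-N)]
```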

Details

read_logs downloads every log between start and end that is missing from the directory given in dir. Once the logs have been downloaded, each is automatically replaced by its unzipped version. Logs for which an unzipped version already exists are skipped. If a downloaded log is corrupted, the download and unzip are attempted once more; if the second attempt also fails, that log is skipped.

The logs are then read in using data.table::fread.

As long as the unzipped logs exist in the directory, they won't be downloaded again. To save time, keep using the same directory and avoid deleting logs that have already been downloaded.
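The skip-and-retry behaviour described above can be sketched roughly as follows. Both helpers are hypothetical and are not the package's own code: missing_log_dates assumes unzipped logs are stored as YYYY-MM-DD.csv inside dir, and fetch_with_retry takes any function that returns TRUE on success, so the example runs without network access.

```r
# Hypothetical helper: dates in the range whose unzipped log is not yet
# present in `dir` (assumes files are named YYYY-MM-DD.csv).
missing_log_dates <- function(start, end, dir) {
  days <- seq(as.Date(start), as.Date(end), by = "day")
  have <- file.exists(file.path(dir, paste0(days, ".csv")))
  days[!have]
}

# Hypothetical helper: call `fetch` for a day at most `max_tries` times.
# `fetch` is any function returning TRUE on success; errors count as failures.
fetch_with_retry <- function(day, fetch, max_tries = 2L) {
  for (i in seq_len(max_tries)) {
    ok <- tryCatch(isTRUE(fetch(day)), error = function(e) FALSE)
    if (ok) return(TRUE)
  }
  FALSE  # give up on this day after the final failed attempt
}

# Demo in a temporary directory where one log already exists.
demo_dir <- file.path(tempdir(), "cran-mirror-demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, "2015-01-02.csv"))
todo <- missing_log_dates("2015-01-01", "2015-01-03", demo_dir)

# A fake fetcher that fails on the first call and succeeds on the second.
attempts <- 0L
flaky_fetch <- function(day) {
  attempts <<- attempts + 1L
  attempts >= 2L
}
recovered <- fetch_with_retry(todo[1], flaky_fetch)

# A fetcher that always fails is abandoned after two attempts.
skipped <- !fetch_with_retry(todo[2], function(day) stop("corrupted"))
```

In a real run, fetch would wrap the download and unzip step for a single day's log.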

Examples

## Not run:
## download all available logs for the last 31 days
dt = read_logs(dir="./cran-mirror", verbose=TRUE)
## End(Not run)