rdwd (version 1.2.0)

dataDWD: Download data from the DWD CDC FTP Server

Description

Get climate data from the German Weather Service (DWD) FTP server. The desired .zip (or .txt) dataset is downloaded into dir. If read=TRUE, it is also read, processed and returned as a data.frame. To solve "errors in download.file: cannot open URL", see https://bookdown.org/brry/rdwd/station-selection.html#fileindex.
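
A minimal sketch of a typical call (not run; requires internet access, station and settings are merely illustrative):

link <- selectDWD("Potsdam", res="daily", var="kl", per="recent") # file URL on the DWD server
clim <- dataDWD(link, dir=tempdir()) # download, read and return a data.frame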

Usage

dataDWD(file, base = dwdbase, joinbf = FALSE, dir = "DWDdata",
  force = FALSE, overwrite = FALSE, sleep = 0, quiet = FALSE,
  progbar = !quiet, browse = FALSE, read = TRUE, ntrunc = 2,
  dfargs = NULL, ...)

Arguments

file

Char (vector): complete file URL(s) (including base and filename.zip) as returned by selectDWD. Can be a vector with several filenames.

base

Single char: base URL that will be removed from output file names. DEFAULT: dwdbase

joinbf

Logical: paste base and file together? DEFAULT: FALSE (selectDWD returns complete URLs already)
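
For illustration only (the relative path below is a placeholder; real paths can be taken from rdwd's fileIndex), a relative file path can be combined with base via joinbf:

# hypothetical relative path, joined with base = dwdbase before downloading
dataDWD("daily/kl/recent/tageswerte_KL_00164_akt.zip", joinbf=TRUE, dir=tempdir())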

dir

Char: Writeable directory name where the downloaded file is saved. Created if it does not exist. DEFAULT: "DWDdata" at the current getwd()

force

Logical (vector): always download, even if the file already exists in dir? Use NA to force re-downloading files older than 24 hours. Use a numerical value to force re-downloading files older than that number of hours. Note that you might want to set overwrite=TRUE as well. If FALSE, the file is still read (or its name returned). DEFAULT: FALSE
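
For example, to re-download files older than 24 hours (useful for per="recent" data) and replace the local copies, assuming link holds a file URL from selectDWD as above:

dataDWD(link, dir=tempdir(), force=NA, overwrite=TRUE) # refresh stale files instead of appending "_1"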

overwrite

Logical (vector): if force=TRUE, overwrite the existing file rather than appending "_1"/"_2" etc. to the filename? DEFAULT: FALSE

sleep

Number. If not 0, a random number of seconds between 0 and sleep is passed to Sys.sleep after each download to avoid getting kicked off the FTP server. DEFAULT: 0

quiet

Logical: suppress message about directory / filenames? DEFAULT: FALSE

progbar

Logical: present a progress bar with estimated remaining time? If missing and length(file)==1, progbar is internally set to FALSE. Only works if the R package pbapply is available. DEFAULT: TRUE (!quiet)

browse

Logical: open repository via browseURL and return URL folder path? If TRUE, no data is downloaded. If file has several values, only unique folders will be opened. DEFAULT: FALSE
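
For example, to merely open the containing folder in a browser instead of downloading, assuming link holds a file URL from selectDWD:

dataDWD(link, browse=TRUE) # opens the folder of link via browseURL and returns the folder path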

read

Logical: read the file(s) with readDWD? If FALSE, only download is performed and the filename(s) returned. DEFAULT: TRUE

ntrunc

Single integer: number of filenames printed in messages before they get truncated with message "(and xx more)". DEFAULT: 2

dfargs

Named list of additional arguments passed to download.file

...

Further arguments passed to readDWD, like fread, varnames etc. Dots were passed to download.file prior to rdwd 0.11.7 (2019-02-25)
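
A combined sketch (dfargs entries go to download.file, the dots go to readDWD; fread=TRUE assumes the data.table package is installed):

# 'quiet' is a download.file argument; 'fread' and 'varnames' are readDWD arguments
dataDWD(link, dir=tempdir(), dfargs=list(quiet=TRUE), fread=TRUE, varnames=TRUE)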

Value

Presuming downloading and processing were successful: if read=TRUE, a data.frame of the desired dataset (as returned by readDWD), otherwise the filename as saved on disc (may have "_n" appended in name, see newFilename). If length(file)>1, the output is a list of data.frames / vector of filenames. The output is always invisible.
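
Since the output is invisible, assign it to inspect it. With several files, e.g. recent plus historical data, the result is a list:

links <- selectDWD("Potsdam", res="daily", var="kl", per="hr") # two file URLs
clim <- dataDWD(links, dir=tempdir()) # list of two data.frames
str(clim, max.level=1)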

See Also

selectDWD, readDWD, download.file. https://bookdown.org/brry/rdwd. Helpful for plotting: berryFunctions::monthAxis; see also berryFunctions::climateGraph

Examples

# NOT RUN {
 ## requires internet connection
# find FTP files for a given station name and file path:
link <- selectDWD("Fuerstenzell", res="hourly", var="wind", per="recent")
# download file:
fname <- dataDWD(link, dir=tempdir(), read=FALSE) ; fname
# dir="DWDdata" is the default directory to store files
# unless force=TRUE, already obtained files will not be downloaded again

# read and plot file:
wind <- readDWD(fname, varnames=TRUE) ; head(wind)
metafiles <- readMeta(fname)          ; str(metafiles, max.level=1)
column_names <- readVars(fname)       ; head(column_names)

plot(wind$MESS_DATUM, wind$F, main="DWD hourly wind Fuerstenzell", col="blue",
     xaxt="n", las=1, type="l", xlab="Date", ylab="Hourly Wind speed  [m/s]")
berryFunctions::monthAxis(1)


# current and historical files:
link <- selectDWD("Potsdam", res="daily", var="kl", per="hr"); link
potsdam <- dataDWD(link, dir=tempdir())
potsdam <- do.call(rbind, potsdam) # this will partly overlap in time
plot(TMK~MESS_DATUM, data=tail(potsdam,1500), type="l")
# The straight line marks the jump back in time
# Keep only historical data in the overlap time period:
potsdam <- potsdam[!duplicated(potsdam$MESS_DATUM),]


# With many files (>>50), use sleep to avoid getting kicked off the FTP server
#links <- selectDWD(res="daily", var="solar")
#sol <- dataDWD(links, sleep=20) # random waiting time after download (0 to 20 secs)

# Real life examples can be found in the use cases section of the vignette:
# browseURL("https://bookdown.org/brry/rdwd")
# }
