Select files for downloading with dataDWD().
The available folders with datasets are listed at
https://bookdown.org/brry/rdwd/available-datasets.html.
To use an updated index (if necessary), see
https://bookdown.org/brry/rdwd/fileindex.html.
All arguments (except for mindex, findex and base) can be a vector and will be recycled to the maximum length of all arguments.
If that length > 1, the output is a list of filenames (or vector if outvec=TRUE).
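The recycling rule can be illustrated in base R. This is only a sketch of the general idea: recycle_args is a hypothetical helper and not a function from rdwd, and the package's internal implementation may differ.

```r
# Hypothetical helper (not part of rdwd): recycle all arguments
# to the length of the longest one.
recycle_args <- function(...) {
  args <- list(...)
  n <- max(lengths(args))
  lapply(args, rep, length.out = n)
}

rec <- recycle_args(id = "01050", res = c("daily", "monthly"),
                    var = "kl", per = "r")
str(rec)  # every element now has length 2
```

Each position across the recycled vectors then describes one request, so two resolutions with one id give two requests rather than every possible combination.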
If station name is given, but id is empty (""), id is inferred via findID() using mindex.
If res/var/per are given and valid (existing in findex), they are pasted together to form a path.
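A minimal base R sketch of forming and checking such a path follows. The vector known_paths is a toy stand-in for the paths in a file index, with made-up entries; how selectDWD itself validates the path is not shown here.

```r
# Toy stand-in for the paths listed in a file index (hypothetical entries).
known_paths <- c("daily/kl/historical", "daily/kl/recent", "monthly/kl/recent")

res <- c("daily", "monthly")
var <- "kl"
per <- "recent"

# paste() recycles the shorter arguments and combines them elementwise.
path <- paste(res, var, per, sep = "/")
valid <- path %in% known_paths
print(data.frame(path, valid))
```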
Here is an overview of the behavior in each case of availability:
case | id   | path | output
1    | ""   | ""   | base (and some warnings)
2    | "xx" | ""   | All file names (across paths) for station id
3    | ""   | "xx" | The zip file names at path
4    | "xx" | "xx" | Regular single data file name
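The four cases in the table depend only on whether id and path are empty. As a sketch, a hypothetical helper (classify_case, not part of rdwd) could number them like this:

```r
# Hypothetical helper (not part of rdwd): map emptiness of id and path
# to the case numbers used in the overview table.
classify_case <- function(id = "", path = "") {
  has_id   <- nzchar(as.character(id))
  has_path <- nzchar(as.character(path))
  1L + has_id + 2L * has_path
}

cases <- classify_case(id   = c("", "00386", "",                "00386"),
                       path = c("", "",      "daily/kl/recent", "daily/kl/recent"))
cases
```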
For case 2, you can explicitly set res="",var="",per="" to avoid the default interactive selection.
For case 3 and 4 (path given), you can set meta=TRUE.
Then selectDWD will return the name of the station description txt file at path.
This is why case 3 with the default meta=FALSE only returns the data file names (ending in .zip) and not the description and Beschreibung txt/pdf files.
Open those in a browser with
pdfpath <- grep("daily/kl/h.*DESCRIPTION", fileIndex$path, value=TRUE)
browseURL(paste0(dwdbase, "/", pdfpath))
Let me know if, besides meta, a pdf argument is needed for automated opening.
selectDWD(
name = "",
res = NA,
var = NA,
per = NA,
base = dwdbase,
outvec = any(per %in% c("rh", "hr")),
findex = fileIndex,
remove_dupli = TRUE,
current = FALSE,
id = findID(name, exactmatch = exactmatch, mindex = mindex, quiet = quiet),
mindex = metaIndex,
exactmatch = TRUE,
meta = FALSE,
meta_txt_only = TRUE,
quiet = rdwdquiet(),
...
)
name
Char: station name(s) passed to findID(), along with exactmatch and mindex.
All 3 arguments are ignored if id is given. DEFAULT: ""
res
Char: temporal resolution available at base, usually one of c("hourly","daily","monthly"), see section 'Description' above.
res/var/per together form the path.
DEFAULT: NA for interactive selection
var
Char: weather variable of interest, e.g.
"air_temperature", "cloudiness", "precipitation", "soil_temperature", "solar", "kl", "more_precip".
See above and in fileIndex.
DEFAULT: NA for interactive selection
per
Char: desired time period. One of
"recent" (data from the last year, up to date usually within a few days) or
"historical" (long time series). Can be abbreviated (if the first
letter is "r" or "h", full names are used). To get both datasets,
use per="hr" or per="rh" (and outvec=TRUE).
per is set to "" if var=="solar".
DEFAULT: NA for interactive selection
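The abbreviation rule for per can be sketched in base R. expand_per is a hypothetical helper, not the package's code, and the order in which the two periods are returned for "hr"/"rh" is an assumption.

```r
# Hypothetical helper (not part of rdwd): expand an abbreviated period.
expand_per <- function(per) {
  stopifnot(length(per) == 1)
  if (is.na(per)) return(per)             # NA means interactive selection
  if (per %in% c("hr", "rh")) return(c("historical", "recent"))  # order assumed
  first <- substr(per, 1, 1)
  if (first == "r") return("recent")
  if (first == "h") return("historical")
  per                                     # e.g. "" is passed through unchanged
}

expand_per("h")
expand_per("rec")
expand_per("rh")
```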
base
Single char: main directory of DWD ftp server.
Must be the same base used to create findex.
DEFAULT: dwdbase
outvec
Single logical: if path or ID length > 1, instead of a list, return a vector? (via unlist()).
DEFAULT: any(per %in% c("rh","hr"))
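The difference between the list output and the vector output comes down to unlist(). A small base R illustration, using made-up file names rather than real server entries:

```r
# Made-up file names, one list element per requested path or id.
as_list <- list("path_a/file_hist.zip",
                c("path_b/file_hist.zip", "path_b/file_akt.zip"))

as_vector <- unlist(as_list)
length(as_list)    # 2 list elements
length(as_vector)  # 3 file names in one character vector
```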
findex
Single object: Index used to select filename, as returned by createIndex().
To use a current / custom index, see
https://bookdown.org/brry/rdwd/fileindex.html.
DEFAULT: fileIndex
remove_dupli
Logical: Remove duplicate entries in the fileIndex?
If duplicates are found, a warning will be issued, unless quiet=TRUE.
The DWD updates files on the server quite often and sometimes
misses removing the old files, leading to duplicates,
usually with differences only in the date range.
A semi-current (manually updated) list of duplicates is on GitHub.
Before reporting, run updateRdwd() to see if fileIndex has been updated.
I email the DWD about duplicates when I find them; they usually fix them soon.
If remove_dupli=TRUE, only the file with the longer timespan will be kept.
This is selected according to filename, which is not very reliable,
hence manual checking is recommended.
DEFAULT: TRUE
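To make the "longer timespan according to filename" idea concrete, here is a base R sketch. The file names are illustrative and the pattern assumes names ending in _YYYYMMDD_YYYYMMDD_hist.zip; the logic actually used inside selectDWD is not documented here and may differ.

```r
# Illustrative duplicate entries that differ only in the end date.
files <- c("tageswerte_KL_00386_18930101_20171231_hist.zip",
           "tageswerte_KL_00386_18930101_20181231_hist.zip")

# Assumed naming scheme: ..._<start>_<end>_hist.zip with 8-digit dates.
pattern <- ".*_([0-9]{8})_([0-9]{8})_hist\\.zip$"
start <- as.Date(sub(pattern, "\\1", files), format = "%Y%m%d")
end   <- as.Date(sub(pattern, "\\2", files), format = "%Y%m%d")

span <- as.numeric(end - start)  # timespan in days
keep <- files[which.max(span)]   # file covering the longer period
keep
```

Because the dates are read from the name and not from the data, a misnamed file would be ranked incorrectly, which is one reason manual checking remains advisable.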
current
Single logical for case 3/4 with given path: instead of findex, use a list of the currently available files at base/res/var/per? This will call indexFTP(), thus requires availability of the RCurl package.
DEFAULT: FALSE
id
Char/Number: station ID with or without leading zeros, e.g. "00614" or 614. Is internally converted to an integer, because some DWD metadata files also contain no leading zeros. DEFAULT: findID(name, exactmatch, mindex)
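The equivalence of both id notations can be checked in base R. Padding back to five digits is an assumption based on the example ids shown on this page ("00614", "00386", "01050").

```r
id_chr <- "00614"
id_num <- 614

# Both notations give the same integer.
same <- as.integer(id_chr) == as.integer(id_num)

# Pad back to a zero-filled string (a width of 5 is assumed from the examples).
padded <- formatC(as.integer(id_num), width = 5, flag = "0")
same
padded
```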
meta
Logical: return metadata txt file name instead of climate data zip file?
Relevant only in case 4 (path and id given) and case 3 for res="multi_annual".
See metaIndex for a compilation of all metadata files.
DEFAULT: FALSE
meta_txt_only
Logical: if meta, only return .txt files, not the pdf and html files? DEFAULT: TRUE
quiet
Suppress id length warnings? DEFAULT: FALSE through rdwdquiet()
...
Further arguments passed to indexFTP() if current=TRUE, except folder and base.
Character string with file path and name(s) in the format "base/res/var/per/filename.zip"
# NOT RUN {
# Give weather station name (must be existing in metaIndex):
selectDWD("Potsdam", res="daily", var="kl", per="historical")
# all files for all stations matching "Koeln":
selectDWD("Koeln", res="", var="", per="", exactmatch=FALSE)
findID("Koeln", FALSE)
# }
# NOT RUN {
# Excluded from CRAN checks to save time
# selectDWD("Potsdam") # interactive selection of res/var/per
# directly give station ID, can also be id="00386" :
selectDWD(id=386, res="daily", var="kl", per="historical")
# period can be abbreviated:
selectDWD(id="00386", res="daily", var="kl", per="h")
selectDWD(id="00386", res="daily", var="kl", per="h", meta=TRUE)
# vectorizable:
selectDWD(id="01050", res="daily", var="kl", per="rh") # list if outvec=FALSE
selectDWD(id="01050", res=c("daily","monthly"), var="kl", per="r")
# vectorization does not give the outer product; arguments are paired elementwise:
selectDWD(id="01050", res=c("daily","monthly"), var="kl", per="hr")
# all zip files in all paths matching id:
selectDWD(id=c(1050, 386), res="",var="",per="")
# all zip files in a given path (if ID is empty):
head( selectDWD(id="", res="daily", var="kl", per="recent") )
# }