A function to scrape files to local csv files
This function uses the monthly list of stations and downloads the corresponding files to a local directory. There are 7676 files as of July 2011. The function throws warnings about wrong file sizes. These can be ignored, or suppressed by setting warning options.
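Both ways of silencing such warnings can be sketched with base R. The function `noisyDownload` below is a hypothetical stand-in that only emits a warning; it is not part of the package and does not download anything.

```r
# Hypothetical stand-in for a call that warns about a file size mismatch.
noisyDownload <- function() {
  warning("downloaded length does not match reported length")
  "done"
}

# Option 1: suppress warnings for a single call.
res1 <- suppressWarnings(noisyDownload())

# Option 2: turn warnings off globally, then restore the previous setting.
old <- options(warn = -1)
res2 <- noisyDownload()
options(old)
```

Restoring the saved options afterwards keeps warnings from unrelated code visible.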
scrapeToCsv(Stations, get = seq(from = 1, to = 1e+05), directory = "EnvCanada")
- Stations is a data structure returned from readMonthlyStations. If the monthly station file already exists, it can simply be read from disk with
- get is a sequence of numbers used to index the monthly station list.
It defaults to 1:100000, which results in the function trying to download
all 7676 files from Env Canada. Alternatively, one can download the files
in chunks, for example by setting get to 1:1000 or any other sequence of
numbers. Internal checking ensures that the sequence sought is available
for download. Irregular sequences are also supported:
get = c(23, 65, 257, 7000) would get those elements from the list of
stations in monthly.env.csv.
- directory is the local directory to write the csv files to. It defaults to "EnvCanada".
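The index handling described for get can be illustrated with a minimal sketch. It does not reproduce the package's internal check: `clipToAvailable` and `chunkIndices` are hypothetical helpers, and the station count is hard-coded from the figure quoted above.

```r
# Hypothetical helper: keep only indices that exist in a list of n stations.
clipToAvailable <- function(get, n) {
  get <- unique(as.integer(get))
  get[get >= 1 & get <= n]
}

# Hypothetical helper: split a vector of indices into chunks of at most `size`.
chunkIndices <- function(idx, size = 1000) {
  split(idx, ceiling(seq_along(idx) / size))
}

n <- 7676  # number of monthly station files as of July 2011

# The default sequence is far longer than the list, so it is clipped to 1:n.
all_idx <- clipToAvailable(seq(from = 1, to = 1e+05), n)
chunks <- chunkIndices(all_idx, size = 1000)

# An irregular sequence; the out-of-range value 9999 is dropped.
irregular <- clipToAvailable(c(23, 65, 257, 7000, 9999), n)

length(all_idx)  # 7676
length(chunks)   # 8
irregular        # 23 65 257 7000
```

Each element of `chunks` could then be passed as the get argument in a separate call, for example scrapeToCsv(Stations, get = chunks[[1]]).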
When createMonthlyStations is executed, the master list is
parsed and only those stations that report monthly are copied into a
file. The file contains a web Id that is used when downloading. To
scrape the files in the monthly data structure, you call
scrapeToCsv and provide a sequence of stations you want to
download. The download will occasionally fail because of server timeouts.
By using the function getMissingScrapes you can determine which
files are missing from the directory. So if you try to download all
7676 files and the server times out after 2365, the function
getMissingScrapes will provide a sequence of files to be
downloaded to complete your scrape.
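The idea behind recovering from a partial scrape can be sketched as follows. This is not the implementation of getMissingScrapes: the helper `missingIndices` and the file naming scheme (`<index>.csv`) are assumptions made only for illustration, and the demo writes dummy files to a temporary directory instead of downloading anything.

```r
# Hypothetical helper: which of the wanted indices have no csv file yet?
# Assumes (for illustration only) that file i is saved as "<i>.csv".
missingIndices <- function(wanted, directory) {
  present <- list.files(directory, pattern = "\\.csv$")
  done <- as.integer(sub("\\.csv$", "", present))
  setdiff(wanted, done)
}

# Simulate a scrape of 10 files that stopped after the 6th one.
dir <- file.path(tempdir(), "EnvCanadaDemo")
dir.create(dir, showWarnings = FALSE)
for (i in 1:6) writeLines("demo", file.path(dir, paste0(i, ".csv")))

todo <- missingIndices(1:10, dir)
todo  # 7 8 9 10
```

The resulting vector could then be supplied as the get argument of a further scrapeToCsv call to complete the scrape.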
of values in the "get" parameter.
## Not run:
# Stations <- writeMonthlyStations()
# scrapeToCsv(Stations, get = 1:100)
# scrapeToCsv(Stations, get = 100:2075)
## End(Not run)