wunderscraper v0.1.0





Scrape the 'Wunderground' API

A function for randomly sampling from the 'Wunderground' API <http://api.wunderground.com>. Provides multistage sampling strategies, respects API usage limits, and collects and saves 'Wunderground' data.



A package for sampling weather stations via Wunderground


Wunderscraper helps tap and organize a wealth of real-time weather data from Wunderground. Wunderground's vast network of weather stations reports in real time, so it must be sampled; it is impossible to collect data from all the stations all the time. Wunderscraper provides flexible spatial and temporal sampling to efficiently build a representation of weather at hyperlocal scales.




Sampling is a method for constructing a representation of a population. At the heart of sampling theory is independence; sampling one unit shouldn't change the probability of sampling another. Spatial sampling is especially challenging because units are not independent. Measurements at one weather station will be correlated with those at nearby stations. One way to preserve spatial independence is to partition space into units that are independent, and draw a representation from each partition.

Sampling methods offer a couple of basic tools for preserving independence and focusing on a population of interest. Multistage sampling is the primary tool for partitioning a population into independent units. The initial stages draw samples from large units, like regions or states, and later stages draw samples from smaller units nested within the larger ones, e.g. counties or zip codes. Stratified sampling is a tool for ensuring sub-populations receive adequate coverage: it repeats a sample stage for each sub-population. It is useful for evenly covering sub-populations, or for oversampling a particularly small sub-population. See the examples in the next section for more details.
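The difference between the two tools can be sketched in base R on a toy sampling frame. This is only an illustration of the ideas above, not wunderscraper's implementation: the frame, the `multistage()` and `stratified()` helpers, and all values are made up for the example.

```r
## Toy sampling frame: 3 states, 2 counties each, 4 stations per county.
## All names and values here are hypothetical.
set.seed(1)
frame <- expand.grid(station = 1:4, county = c("a", "b"),
                     state = c("09", "34", "36"),
                     stringsAsFactors = FALSE)
frame$county <- paste(frame$state, frame$county, sep = "-")
frame$id <- paste(frame$county, frame$station, sep = "-")

## Multistage: draw 1 state, then 1 county nested within it, then 2 stations
## nested within that county.
multistage <- function(frame) {
  state <- sample(unique(frame$state), 1)
  inState <- frame[frame$state == state, ]
  county <- sample(unique(inState$county), 1)
  inCounty <- inState[inState$county == county, ]
  inCounty[sample(nrow(inCounty), 2), ]
}

## Stratified: repeat the county and station stages once for every state, so
## each state is guaranteed to appear in the sample.
stratified <- function(frame) {
  do.call(rbind, lapply(split(frame, frame$state), function(s) {
    county <- sample(unique(s$county), 1)
    inCounty <- s[s$county == county, ]
    inCounty[sample(nrow(inCounty), 2), ]
  }))
}

ms <- multistage(frame)   # 2 stations from a single state and county
st <- stratified(frame)   # 2 stations from each of the 3 states
```

The multistage draw concentrates effort in one randomly chosen county, while the stratified draw spends the same per-stage effort in every state.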


  • Wunderscraper is integrated with the tigris package for state and county administrative boundaries

    schedulerMMDD <- scheduler()
    ## sample 1 county and collect all weather stations.  Will keep only stations
    ## within the county administrative boundary, as determined from tigris
    scrape(schedulerMMDD, c("GEOID", "ZCTA5", "id"), size=1)
  • Multistage sampling provides efficient coverage over an area of interest

    ## monitor a tri-state area
    triState <- zctaRel[zctaRel$STATEFP %in% c("09", "34", "36"), ]
    repeat scrape(schedulerMMDD, c("STATEFP", "GEOID", "ZCTA5"), size=c(1, 10, 1, 10),
  • Stratified sampling ensures all sub-populations are adequately covered

    ## monitor a tri-state area, stratified by state to ensure complete coverage in each sample
    repeat scrape(schedulerMMDD, c("GEOID", "ZCTA5"), size=c(10, 1, 10), strata=rep("STATEFP", 3),
  • Set a schedule to control the period of repeated samples

    ## monitor a tri-state area with two hour period
    plan(schedulerMMDD, '2 hours')
    repeat {
      scrape(schedulerMMDD, c("GEOID", "ZCTA5"), size=c(10, 1, 10), strata=rep("STATEFP", 3),
  • Create spatial grids on the fly for stages or strata

    ## sample stations at a resolution of 0.01 degrees, one station per grid cell
    scrape(schedulerMMDD, c("GEOID", "ZCTA5"), size=c(10, 1, 1), strata=c(NA, NA, "GRID"),
         cellsize=c(NA, 0.01))
  • More examples in scrape
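To make the grid idea concrete, here is a minimal base R sketch of sampling one station per 0.01-degree cell. It does not use wunderscraper and is not how the package builds its grids; the `stations` data frame, its coordinates, and the `pickOne()` helper are hypothetical.

```r
## Hypothetical station coordinates (made-up values for illustration).
set.seed(2)
stations <- data.frame(id = sprintf("S%02d", 1:50),
                       lon = runif(50, -74.05, -74.00),
                       lat = runif(50, 40.70, 40.75))

## Label each station with the grid cell it falls in.
cellsize <- 0.01
stations$cell <- paste(floor(stations$lon / cellsize),
                       floor(stations$lat / cellsize))

## Keep one randomly chosen station per occupied cell.
pickOne <- function(rows) rows[sample.int(length(rows), 1)]
keep <- vapply(split(seq_len(nrow(stations)), stations$cell),
               pickOne, integer(1))
gridSample <- stations[keep, ]
```

Using the cell as a stratum caps the sample at one station per cell, which evens out coverage where stations cluster.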


Functions in wunderscraper

Name           Description
sync           Sync scheduler's schedule to current time
wunderscraper  wunderscraper: A package for sampling weather stations via Wunderground
plan           Plan a schedule for executing a task
scheduler      Schedules wunderscraper
zctaRel        Relationships for zip codes
scrape         Scrape Wunderground API
setApiKey      Set API key



License MIT + file LICENSE
Encoding UTF-8
LazyData true
RoxygenNote 6.0.1
NeedsCompilation no
Packaged 2018-01-13 12:23:27 UTC; omnia
Repository CRAN
Date/Publication 2018-01-13 17:31:21 UTC
imports httr, jsonlite, sf, tigris
depends R (>= 3.4.0)
suggests testthat
