scrappy: A Simple Web Scraper

The goal of scrappy is to provide simple functions to scrape data from different websites for academic purposes.

Installation

You can install the released version of scrappy from CRAN with:

install.packages("scrappy")

And the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("villegar/scrappy")

Example

NEWA @ Cornell University

scrappy can retrieve weather data from the Network for Environment and Weather Applications (NEWA) at Cornell University. Website: http://newa.cornell.edu

# Create RSelenium session
rD <- RSelenium::rsDriver(browser = "firefox", port = 4548L, verbose = FALSE)

# Call scrappy
out <- scrappy::newa_nrcc(client = rD$client, 
                          year = 2020, 
                          month = 12, # December
                          station = "gbe", # Geneve (Bejo) station
                          save_file = FALSE) # Don't save output to a CSV file
# Stop server
rD$server$stop()
#> [1] TRUE
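With `save_file = FALSE`, the data are returned to the R session instead of being written to disk, so you can inspect and persist them yourself with base R. A minimal sketch of that workflow; the mock data frame below stands in for the object returned by `scrappy::newa_nrcc`, and its column names are assumed from the sample output shown further down:

```r
# Mock of the data frame returned by scrappy::newa_nrcc
# (structure assumed from the sample output below)
out <- data.frame(
  `Date/Time`    = c("12/31/2020 23:00 EST", "12/31/2020 22:00 EST"),
  `Air Temp (F)` = c(33.1, 33.0),
  `RH (%)`       = c(82, 80),
  Station        = c("gbe", "gbe"),
  check.names    = FALSE  # keep the original column names verbatim
)

# Inspect the first rows
head(out)

# Persist manually to a CSV file of your choosing
write.csv(out, file = "gbe-2020-12.csv", row.names = FALSE)
```

This keeps control over the output filename and location, which is useful when scraping several stations or months in a loop.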

Partial output from the previous example:

| Date/Time            | Air Temp (℉) | Precip (inches) | Leaf Wetness (minutes) | RH (%) | Wind Spd (mph) | Wind Dir (degrees) | Solar Rad (langleys) | Dewpoint (℉) | Station |
|----------------------|--------------|-----------------|------------------------|--------|----------------|--------------------|----------------------|--------------|---------|
| 12/31/2020 23:00 EST | 33.1         | 0               | 0                      | 82     | 2.8            | 264                | 0                    | 28           | gbe     |
| 12/31/2020 22:00 EST | 33.0         | 0               | 0                      | 80     | 3.3            | 250                | 0                    | 28           | gbe     |
| 12/31/2020 21:00 EST | 32.8         | 0               | 0                      | 81     | 2.6            | 261                | 0                    | 28           | gbe     |
| 12/31/2020 20:00 EST | 32.5         | 0               | 0                      | 84     | 1.7            | 277                | 0                    | 28           | gbe     |
| 12/31/2020 19:00 EST | 32.9         | 0               | 0                      | 81     | 2.1            | 279                | 0                    | 28           | gbe     |
| 12/31/2020 18:00 EST | 33.3         | 0               | 0                      | 79     | 3.0            | 272                | 0                    | 28           | gbe     |
| 12/31/2020 17:00 EST | 33.5         | 0               | 0                      | 78     | 3.9            | 274                | 1                    | 27           | gbe     |
| 12/31/2020 16:00 EST | 34.1         | 0               | 0                      | 74     | 4.9            | 272                | 7                    | 27           | gbe     |
| 12/31/2020 15:00 EST | 33.8         | 0               | 0                      | 72     | 7.1            | 277                | 8                    | 26           | gbe     |
| 12/31/2020 14:00 EST | 34.4         | 0               | 0                      | 70     | 7.9            | 276                | 13                   | 26           | gbe     |

Package details

Version: 0.0.1
Monthly Downloads: 152
License: MIT + file LICENSE
Maintainer: Roberto Villegas-Diaz
Last Published: January 9th, 2021

Functions in scrappy (0.0.1)

%>%: Pipe operator
newa_nrcc: Retrieve data from NEWA at Cornell University