Rcrawler (version 0.1.9-1)

Web Crawler and Scraper

Description

Performs parallel web crawling and web scraping. It is designed to crawl, parse, and store web pages, producing data that can be used directly in analysis applications. For details, see Khalil and Fakir (2017).

Install

install.packages('Rcrawler')

Monthly Downloads

544

Version

0.1.9-1

License

GPL (>= 2)

Maintainer

Salim Khalil

Last Published

November 11th, 2018

Functions in Rcrawler (0.1.9-1)

Getencoding
LinkExtractor
LoginSession: Open a logged-in session
install_browser: Install PhantomJS webdriver
RobotParser: Fetch and parse robots.txt
run_browser: Start up web driver process on localhost, with a random port
Rcrawler
stop_browser: Stop web driver process and remove its object
browser_path: Return browser (webdriver) location path
ContentScraper
LinkNormalization: Link normalization
Linkparameters: Get the list of parameters and values from a URL
Linkparamsfilter: Link parameters filter
LoadHTMLFiles
ListProjects
Drv_fetchpage: Fetch page using web driver/session
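
As a rough illustration of how a few of the functions listed above might be used together, here is a minimal sketch. It assumes the package is already installed. The URLs are made-up placeholders, and `crawl_example` is a hypothetical wrapper written for this example, not part of the package. The argument values are arbitrary; check the package manual for the full signatures.

```r
# Assumes Rcrawler has been installed with install.packages('Rcrawler').
library(Rcrawler)

# Placeholder URL, used only for illustration.
url <- "http://www.example.com/index.php?name=jake&age=23&tmp=2&ord=1"

# List the query parameters and values found in the URL.
params <- Linkparameters(url)

# Drop selected query parameters, or all of them.
cleaned <- Linkparamsfilter(url, c("ord", "tmp"))
bare <- Linkparamsfilter(url, removeAllparams = TRUE)

# Resolve relative links against the page they were found on.
links <- c("/finance/page-2017.html",
           "./section/subscription.php",
           "IndexEn.aspx")
absolute <- LinkNormalization(links, "http://www.example.com")

# Hypothetical wrapper around a small crawl. It needs network access and
# writes the collected pages to the working directory, so it is only
# defined here and not called.
crawl_example <- function(site = "https://www.example.com") {
  Rcrawler(Website = site, no_cores = 2, no_conn = 2, MaxDepth = 1)
}
```

The URL helpers run offline, so they are a convenient way to check how links will be cleaned before starting a crawl. Calling `crawl_example()` starts an actual crawl limited to one level of depth.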