
robotstxt (version 0.5.2)

get_robotstxts: function to get multiple robotstxt files

Description

Function to get multiple robots.txt files, one per domain supplied.

Usage

get_robotstxts(domain, warn = TRUE, force = FALSE,
  user_agent = utils::sessionInfo()$R.version$version.string,
  ssl_verifypeer = c(1, 0), use_futures = FALSE)
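
For illustration, a minimal call might look like the following sketch; the domains are assumed examples and are not taken from the package documentation.

# retrieve robots.txt files for several domains in one call
library(robotstxt)
rtxts <- get_robotstxts(domain = c("www.r-project.org", "cran.r-project.org"))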

Arguments

domain

domain, or vector of domains, from which to download the robots.txt file(s)

warn

warn about being unable to download domain/robots.txt, e.g. because of an HTTP response status 404

force

if TRUE, the function will re-download the robots.txt file instead of using possibly cached results

user_agent

HTTP user-agent string to be used to retrieve the robots.txt file from the domain

ssl_verifypeer

analogous to the libcurl option CURLOPT_SSL_VERIFYPEER (https://curl.haxx.se/libcurl/c/CURLOPT_SSL_VERIFYPEER.html); changing it might help with robots.txt file retrieval in some cases

use_futures

Should future::future_lapply be used for possible parallel/asynchronous retrieval? Note: check the help pages and vignettes of the future package on how to set up plans for future execution, because the robotstxt package does not do this on its own (see the example below).
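
Examples

The following is a minimal sketch of parallel/asynchronous retrieval, not taken from the package documentation; the multisession plan and the domains are illustrative assumptions.

# a future plan has to be set up by the user beforehand (e.g. a
# multisession plan), because the robotstxt package does not do this
# on its own
library(robotstxt)
library(future)
plan(multisession)

domains <- c("www.r-project.org", "cran.r-project.org")
rtxts <- get_robotstxts(domain = domains, use_futures = TRUE)

# inspect the result: one list element per domain
str(rtxts, max.level = 1)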