Ecfun (version 0.1-4)

testURLs: Test URLs for intermittent download problems

Description

***NOTE: THIS IS A PRELIMINARY VERSION OF THIS FUNCTION; IT MAY BE CHANGED OR REMOVED IN A FUTURE RELEASE.

Uses try(getURL(...)) to read each element of urls. After each try, a row is written to file. indicating which of urls was tested, the test time in seconds, and any error message. Any failure is repeated up to maxFail times. After testing each element of urls once, repeat n times.

If(ping), precede each test with "ping url[i]". NOTE: Some Internet Service Providers seem to block some attempts to use "ping" or return fraudulent replies to "ping". It is included in the code because it seemed like an obvious test. However, it is not executed by default because the results do not necessarily reflect what people might expect from "ping".

Returns a list of the last successful version read, if any, from each element of urls, with two attributes: (1) "urls", containing the urls argument; (2) "testURLresults", an object of class c('testURLs', 'data.frame') holding the same test results that are written to file. on disk.

This function was written to diagnose a download problem with a particular Internet Service Provider (ISP). For other tools for testing an ISP, see measurementlab.net (http://www.measurementlab.net/) or the "Test your ISP" software discussion by the Electronic Frontier Foundation at the URL mentioned in the references below.

Usage

testURLs(urls=c(
 wiki="http://en.wikipedia.org",
 wiki.PVI="http://en.wikipedia.org/wiki/Cook_Partisan_Voting_Index",
 house="http://house.gov",
 house.reps="http://house.gov/representatives"),
         file.='testURLresults.csv',
         n=10, maxFail=10, warn=-1, tzone='GMT', ping=FALSE, ...)

Arguments

urls
a character vector assumed to be uniform resource locators (URLs) to pass to getURL for testing. The default was selected to provide a 2 x 2 experiment with two different web sites (en.wikip
file.
Name of a CSV file to which to write the results. If the file already exists, new results are appended to it.
n
number of times to repeat the cycle testing each member of urls.
maxFail
maximum number of tests for a continually failing URL. This is designed to make it relatively easy to determine dependencies between failures. If the failure rate is constant, the number of consecutive failures will follow a Poisson distribut
warn
warn argument to pass to Ping.
tzone
Time zone for Time. Defaults to GMT (UTC). tzone=NULL will use the current locale.
ping
logical: TRUE to include Ping, FALSE otherwise.
...
optional arguments for Ping.
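
The interplay of maxFail and file. can be illustrated with a minimal sketch. This is not the package's implementation: tryWithRetries, flakyReader and logFile are hypothetical names, and flakyReader stands in for getURL so that the sketch runs without a network connection. It only assumes the behaviour documented above: one row is appended to the CSV file per attempt, and a failing URL is retried at most maxFail times.

```r
# Hypothetical helper (not part of Ecfun): try reader(url) up to maxFail
# times and append one CSV row per attempt to file.
tryWithRetries <- function(url, reader, file., maxFail = 10) {
  result <- NULL
  for (attempt in seq_len(maxFail)) {
    out <- try(reader(url), silent = TRUE)
    failed <- inherits(out, "try-error")
    row <- data.frame(
      Time = format(Sys.time(), tz = "GMT", usetz = TRUE),
      URL = url,
      error = if (failed) conditionMessage(attr(out, "condition")) else "",
      stringsAsFactors = FALSE
    )
    # Write the header only when the file is new; append rows otherwise.
    newFile <- !file.exists(file.)
    write.table(row, file = file., sep = ",", append = !newFile,
                col.names = newFile, row.names = FALSE, qmethod = "double")
    if (!failed) {
      result <- out
      break
    }
  }
  result
}

# Simulated reader (an assumption for this demo): fails twice, then succeeds.
calls <- 0
flakyReader <- function(url) {
  calls <<- calls + 1
  if (calls < 3) stop("simulated timeout")
  paste("content of", url)
}

logFile <- tempfile(fileext = ".csv")
res <- tryWithRetries("http://example.com", flakyReader, logFile, maxFail = 5)
logged <- read.csv(logFile, stringsAsFactors = FALSE)
```

Here logged holds one row for each attempt, including the failed ones, so consecutive failures remain visible in the file even though only the successful read is returned.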

Value

  • an object of class testURLs, which in this case is a list of the last successful result returned by getURL for each element of urls with the following attributes:
  • urls: the urls argument used for this call
  • testURLresults: an object of class c('testURLs', 'data.frame') of the data written to file.. This has the following columns:
    • Time: date() for the time a particular test started
    • URL: the name in urls of the URL tested
    • ping statistics: several columns with the count and stats returned by Ping
    • readTime: time in seconds for the attempt to read the URL (getURL(urls[j])) to complete
    • error: character: '' if the read attempt was successful; the error message if not
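
As a rough illustration of how the testURLresults attribute might be summarized, the sketch below builds a mock object with the documented columns Time, URL, readTime and error (the ping columns are omitted) and computes a failure rate and mean read time per URL. The data values and the names mockResults, failureRate and meanReadTime are invented for this example; a real call to testURLs would supply the actual results.

```r
# Mock results with the documented columns; the values are invented
# for illustration and are not real measurements.
mockResults <- data.frame(
  Time = rep("Mon Jan  1 00:00:00 2024", 4),
  URL = c("PVI", "house", "PVI", "house"),
  readTime = c(0.8, 1.2, 0.9, 30.0),
  error = c("", "", "", "simulated timeout"),
  stringsAsFactors = FALSE
)
class(mockResults) <- c("testURLs", "data.frame")

# Mock return value: a named list carrying the two documented attributes.
tst <- structure(
  list(PVI = "<html>...</html>", house = "<html>...</html>"),
  urls = c(PVI = "http://en.wikipedia.org/wiki/Cook_Partisan_Voting_Index",
           house = "http://house.gov/representatives"),
  testURLresults = mockResults,
  class = "testURLs"
)

# Extract the test log and summarize it by URL name.
res <- attr(tst, "testURLresults")
failureRate <- tapply(res$error != "", res$URL, mean)   # share of failed reads
meanReadTime <- tapply(res$readTime, res$URL, mean)     # seconds per attempt
```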

Details

for(i in 1:n):

1. pingi <- Ping(urls[i], ...)

2. The time for each call to getURL is computed by computing start.time <- proc.time() before calling try(getURL(.)), then computing the following after: elapsed.time <- max(proc.time() - start.time, na.rm=TRUE)

After each of the urls is tested, a summary of the results is appended to file.. This includes the pingi[['stats']], elapsed.time and the error message if the download failed.

The Electronic Frontier Foundation provides a table of existing software to "Test your ISP"; see the references below. This table includes a column noting whether the software is "active" (sending test traffic) or "passive" (observing the way the network treats natural traffic). The current testURLs function is "active", because it asks for a copy of the code at the indicated URL.
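
The timing step described above can be sketched in isolation. The wrapper timeRead and the stand-in slowReader are hypothetical names introduced here; slowReader replaces getURL so that the example runs offline. The na.rm=TRUE matters because some components of proc.time() can be NA on some platforms.

```r
# Hypothetical wrapper (not part of Ecfun) around the documented timing idiom.
timeRead <- function(reader, url) {
  start.time <- proc.time()
  out <- try(reader(url), silent = TRUE)
  # Largest component of the proc.time() difference, ignoring NA entries.
  elapsed.time <- max(proc.time() - start.time, na.rm = TRUE)
  failed <- inherits(out, "try-error")
  list(
    result = out,
    elapsed.time = elapsed.time,
    error = if (failed) conditionMessage(attr(out, "condition")) else ""
  )
}

# Stand-in for getURL (an assumption for this demo): waits, then returns text.
slowReader <- function(url) {
  Sys.sleep(0.2)
  paste("content of", url)
}

timing <- timeRead(slowReader, "http://example.com")
```

Taking the maximum over all components generally picks out the elapsed (wall clock) time for a download, since waiting on the network consumes little CPU time.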

References

measurementlab.net: http://www.measurementlab.net/

"Test your ISP" software discussion by the Electronic Frontier Foundation: https://www.eff.org/testyourisp

"active" (sending test traffic) or "passive" (observing the way the network treats natural traffic): https://www.eff.org/deeplinks/2008/03/keeping-isps-honest

See Also

try, getURL, Ping

Examples

# Test only 2 web sites, not the default 4,
# and test only twice, not the default 10 times:
tst <- testURLs(c(
 PVI="http://en.wikipedia.org/wiki/Cook_Partisan_Voting_Index",
 house="http://house.gov/representatives"),
    n=2, maxFail=2)

stopifnot(
(class(tst) == 'testURLs') &&
all(names(tst) == c('PVI', 'house')) &&
all(names(attributes(tst)) ==
    c('names', 'urls', 'testURLresults', 'class'))
)