RCurl (version 1.95-4.7)

dynCurlReader: Dynamically determine content-type of body from HTTP header and set body reader

Description

This function is used for the writefunction option in a curl HTTP request. The reader first collects the header of the HTTP response. When it sees a blank line, which marks the end of the header, it examines the header and looks for a Content-Type field. It uses the value of this field to determine the nature of the body of the HTTP response and dynamically (re)sets the reader for the curl handle accordingly. If the content is binary, it collects the content into a raw vector; if it is text, it sets the appropriate character encoding and collects the content into a character vector.

This function is like basicTextGatherer but behaves dynamically by determining how to read the content based on the header of the HTTP response. This function returns a list of functions that are used to update and query a shared state across calls.
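The header-inspection step described above can be sketched in base R. This is only an illustration of the idea, not the code RCurl uses: classify_body is a hypothetical helper, and the rule that only text/* media types count as text is an assumption made for this sketch.

```r
# Hypothetical helper (not part of RCurl): inspect the header text received
# so far and report how the body might be read.
classify_body <- function(header_text) {
  # The header is complete once a blank line has been seen.
  if (!grepl("\r?\n\r?\n", header_text)) {
    return(list(complete = FALSE, type = NA_character_, binary = NA))
  }
  lines <- strsplit(header_text, "\r?\n")[[1]]
  hit <- grep("^content-type:", lines, ignore.case = TRUE, value = TRUE)
  if (length(hit) == 0) {
    return(list(complete = TRUE, type = NA_character_, binary = NA))
  }
  # Keep the media type and drop parameters such as "; charset=UTF-8".
  field <- sub("^[^:]*:[[:space:]]*", "", hit[1])
  type <- tolower(trimws(strsplit(field, ";", fixed = TRUE)[[1]][1]))
  # Assumption for this sketch: only text/* types are treated as text.
  list(complete = TRUE, type = type, binary = !startsWith(type, "text/"))
}

jpeg <- classify_body("HTTP/1.1 200 OK\r\nContent-Type: image/jpeg\r\n\r\n")
html <- classify_body("HTTP/1.1 200 OK\r\nContent-Type: text/html; charset=UTF-8\r\n\r\n")
part <- classify_body("HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n")
```

In a real reader, a binary result would lead to collecting the body in a raw vector and a text result to collecting it in a character vector with the declared encoding.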

Usage

dynCurlReader(curl = getCurlHandle(), txt = character(), max = NA,
              value = NULL, verbose = FALSE, binary = NA, baseURL = NA,
              isHTTP = NA, encoding = NA)

Arguments

curl
the curl handle to be used for the request. It is essential that this handle be used in the low-level call to curlPerform so that the update element sets the reader for the body on the appro
txt
initial value of the text. This is almost always an empty character vector.
max
the maximum number of characters to read. This is almost always NA.
value
a function that can be specified which will be used to convert the body of the response from text or raw in a customized manner, e.g. uncompress a gzip body. This can also be done explicitly with a call fun(reader$value()) afte
verbose
a logical value indicating whether messages about progress and operations are written on the console as the header and body are processed.
binary
a logical value indicating whether the caller knows whether the resulting content is binary (TRUE) or not (FALSE) or unknown (NA).
baseURL
the URL of the request which can be used to follow links to other URLs that are described relative to this.
isHTTP
a logical value indicating whether the request/download uses HTTP or not. If this is NA, we determine this when the header is received. If the caller knows this is an FTP or other request, they can specify this when creating the rea
encoding
a string that allows the caller to specify and override the encoding of the result. This is used to convert text returned from the server.
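As an illustration of the kind of converter the value argument describes, the following base R sketch decompresses a gzip body held in a raw vector. The helper name gunzip_raw is hypothetical, and the payload is built locally so the sketch runs without a network connection; in practice the raw vector would come from the reader.

```r
# Hypothetical converter: turn a gzip-format raw vector into text lines.
gunzip_raw <- function(x) {
  con <- gzcon(rawConnection(x))
  on.exit(close(con))
  readLines(con)
}

# Build a small gzip payload locally to stand in for a downloaded body.
tmp <- tempfile(fileext = ".gz")
gz <- gzfile(tmp, "w")
writeLines(c("first line", "second line"), gz)
close(gz)
payload <- readBin(tmp, what = "raw", n = file.size(tmp))

decoded <- gunzip_raw(payload)
```

With a reader created by dynCurlReader(), the same function could be applied after the request as gunzip_raw(reader$value()), assuming the body was collected as a raw vector.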

Value

A list with 5 elements, all of which are functions:

  • update: the function that does the actual reading/processing of the content that libcurl passes to it from the header and the body. This is the work-horse of the reader.
  • value: a function to get the body of the response.
  • header: a function to get the content of the HTTP header.
  • reset: a function to reset the internal contents, which allows the same reader to be re-used in subsequent HTTP requests.
  • curl: accessor function for the curl handle specified in the call to create this dynamic reader object.

This list has the S3 class vector c("DynamicRCurlTextHandler", "RCurlTextHandler", "RCurlCallbackFunction").
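A short sketch of inspecting the returned object, assuming the RCurl package is installed. No request is made here; the code only looks at the structure described above.

```r
library(RCurl)

# Create a reader with its own curl handle; nothing is downloaded yet.
reader <- dynCurlReader()

# The element names and S3 classes documented above.
reader_names <- names(reader)
reader_class <- class(reader)

# Every element should be a function.
all_functions <- all(vapply(reader, is.function, logical(1)))

# The accessor returns the handle to pass to curlPerform().
handle <- reader$curl()
```

Passing reader$curl() as the curl argument of curlPerform() keeps the request and the reader on the same handle, which the description of the curl argument calls essential.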

concept

binary

References

libcurl http://curl.haxx.se

See Also

basicTextGatherer curlPerform getURLContent

Examples

# Each of these examples can be done with getURLContent().
# These are here just to illustrate the dynamic reader.
if(url.exists("http://www.omegahat.org/Rcartogram/demo.jpg")) {
  header = dynCurlReader()
  curlPerform(url = "http://www.omegahat.org/Rcartogram/demo.jpg",
              headerfunction = header$update, curl = header$curl())
  class( header$value() )
  length( header$value() )
}

if(url.exists("http://www.omegahat.org/dd.gz")) {
     # gzip example.
  header = dynCurlReader()
  curlPerform(url = "http://www.omegahat.org/dd.gz",
              headerfunction = header$update, curl = header$curl())
  class( header$value() )
  length( header$value() )

  if(require(Rcompression))
     gunzip(header$value())
}


# Character encoding example
header = dynCurlReader()
curlPerform(url = "http://www.razorvine.net/test/utf8form/formaccepter.sn",
            postfields = c(text = "ABC", outputencoding = "UTF-8"),
            verbose = TRUE,
            writefunction = header$update, curl = header$curl())
class( header$value() )
Encoding( header$value() )
