repmis (version 0.2.9)

source_data: Load plain-text data from a URL (either http or https)

Description

source_data loads plain-text formatted data stored at a URL (both http and https) into R.

Usage

source_data(url, sha1 = NULL, cache = FALSE, clearCache = FALSE,
  sep = ",", quote = "\"'", header = TRUE,
  stringsAsFactors = default.stringsAsFactors(), ...)

Arguments

url
The plain-text formatted data's URL.
sha1
Character string of the file's SHA-1 hash, as reported by source_data when the file is downloaded. Note that if the data is stored using Git, this is not the SHA-1 hash of the file's commit.
cache
logical. Whether or not to cache the data so that it is not downloaded every time the function is called.
clearCache
logical. Whether or not to clear the downloaded data from the cache.
sep
The field separator for the data. For example, to load comma-separated values (CSV) data use sep = "," (the default). To load tab-separated values (TSV) data use sep = "\t".
quote
the set of quoting characters. To disable quoting altogether, use quote = "". See scan for the behaviour on quotes embedded in quotes.
header
Logical, whether or not the first line of the file is the header (i.e. variable names). The default is header = TRUE.
stringsAsFactors
logical. Should character vectors be converted to factors? Note that this is overridden by as.is and colClasses, both of which allow finer control.
...
additional arguments passed to read.table.
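
The parsing arguments can be tried out locally before pointing source_data at a URL. The following is a minimal sketch that calls base R's read.table directly on an in-memory string; it assumes that sep, header and stringsAsFactors behave in source_data as they do in read.table. The data and the names tsv_text and votes are made up for illustration.

```r
# Hypothetical tab-separated data held in a string instead of at a URL
tsv_text <- "party\tseats\nAlpha\t10\nBeta\t20"

# sep = "\t" splits fields on tabs; header = TRUE uses the first line
# as variable names; stringsAsFactors = FALSE keeps party as character
votes <- read.table(text = tsv_text, sep = "\t", header = TRUE,
                    stringsAsFactors = FALSE)

str(votes)
```

With sep = "," instead, each line would be read as a single field, because the data contains no commas.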

Value

  • a data frame

Source

Based on source_url from Hadley Wickham's devtools package.

Details

Loads plain-text data (e.g. CSV, TSV) from a URL. Works with both HTTP and HTTPS sites. Note: the URL you give for the url argument must be for the RAW version of the file. The function should be able to download plain-text data from any secure (https) URL, though I have not verified this.

From the source_url documentation: "If a SHA-1 hash is specified with the sha1 argument, then this function will check the SHA-1 hash of the downloaded file to make sure it matches the expected value, and throw an error if it does not match. If the SHA-1 hash is not specified, it will print a message displaying the hash of the downloaded file. The purpose of this is to improve security when running remotely-hosted code; if you have a hash of the file, you can be sure that it has not changed."
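
The check described in the quotation can be sketched offline as follows. This is an illustration only, not the package's actual code: check_sha1 is a hypothetical helper, and it assumes the digest package is installed and that the hash is a SHA-1 of the file contents.

```r
library(digest)

# Hypothetical helper mirroring the behaviour described above:
# report the hash when none is supplied, otherwise compare and stop on mismatch
check_sha1 <- function(path, expected = NULL) {
  actual <- digest(file = path, algo = "sha1")
  if (is.null(expected)) {
    message("SHA-1 hash of file is ", actual)
  } else if (!identical(actual, expected)) {
    stop("SHA-1 hash of file (", actual,
         ") does not match expected value (", expected, ")")
  }
  invisible(actual)
}

# A small local file stands in for the downloaded data
tmp <- tempfile(fileext = ".csv")
writeLines(c("x,y", "1,2"), tmp)

hash <- check_sha1(tmp)           # first call: prints the hash
check_sha1(tmp, expected = hash)  # later calls: silent when the file is unchanged

# A changed file no longer matches the recorded hash
writeLines(c("x,y", "1,3"), tmp)
result <- tryCatch(check_sha1(tmp, expected = hash),
                   error = function(e) "mismatch detected")
```

Recording the hash printed on the first download and passing it as sha1 on later calls is what allows a changed file to be detected.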

See Also

httr and read.table

Examples

# Download electoral disproportionality data stored on GitHub
# Note: Using shortened URL created by bitly
DisData <- source_data("http://bit.ly/156oQ7a")

# Check to see if SHA-1 hash matches downloaded file
DisDataHash <- source_data("http://bit.ly/Ss6zDO",
   sha1 = "dc8110d6dff32f682bd2f2fdbacb89e37b94f95d")
