source_data: Load plain-text data from a URL (either http or https)
Description
source_data loads plain-text formatted data stored
at a URL (both http and https) into R.
Usage
source_data(url, sha1 = NULL, sep = ",",
quote = "\"'", header = TRUE, stringsAsFactors = default.stringsAsFactors(), ...)
Arguments
url
The plain-text formatted data's URL.
sha1
Character string of the file's SHA-1 hash,
generated by source_data. Note if you are using
data stored using Git, this is not the file's commit
SHA-1 hash.
sep
The separator method for the data. For
example, to load comma-separated values data (CSV) use
sep = "," (the default). To load tab-separated
values data (TSV) use sep = "\t".
quote
the set of quoting characters. To disable
quoting altogether, use quote = "". See
scan for the behaviour on quotes embedded
in quotes.
header
Logical, whether or not the first line of
the file is the header (i.e. variable names). The default
is header = TRUE.
stringsAsFactors
logical. Should character vectors
be converted to factors? Note that this is overridden by
as.is and colClasses, both of which allow
finer control.
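The effect of sep, header and stringsAsFactors can be tried without a network connection. The sketch below assumes that source_data hands these arguments on to read.table, which the argument descriptions suggest but this page does not confirm. The two text strings are made-up stand-ins for a remote CSV and TSV file.

```r
# Made-up in-memory data standing in for a downloaded CSV and TSV file
csv_text <- "country,year,disprop\nA,2000,1.5\nB,2004,3.2"
tsv_text <- "country\tyear\tdisprop\nA\t2000\t1.5\nB\t2004\t3.2"

# Comma-separated values: sep = "," (the default in source_data)
csv_data <- read.table(text = csv_text, sep = ",", header = TRUE,
                       stringsAsFactors = FALSE)

# Tab-separated values: sep = "\t" (note the backslash)
tsv_data <- read.table(text = tsv_text, sep = "\t", header = TRUE,
                       stringsAsFactors = FALSE)

# With header = FALSE the first line is read as data, not variable names
no_header <- read.table(text = csv_text, sep = ",", header = FALSE,
                        stringsAsFactors = FALSE)

str(csv_data)
str(no_header)
```

Both calls with header = TRUE parse to the same data frame; only the separator differs.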
Source
Based on source_url from Hadley Wickham's devtools
package.
Details
Loads plain-text data (e.g. CSV, TSV) from a URL.
Works with both HTTP and HTTPS sites. Note: the URL you
give for the url argument must be for the RAW
version of the file. The function should work to download
plain-text data from any secure URL (https), though I have
not verified this.
From the source_url documentation: "If a SHA-1 hash is
specified with the sha1 argument, then this function will
check the SHA-1 hash of the downloaded file to make sure it
matches the expected value, and throw an error if it does
not match. If the SHA-1 hash is not specified, it will
print a message displaying the hash of the downloaded file.
The purpose of this is to improve security when running
remotely-hosted code; if you have a hash of the file, you
can be sure that it has not changed."
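The hash check described in the quotation can be sketched as follows. This is a minimal illustration, not the code used by source_data or source_url: check_sha1 is a hypothetical helper, and it assumes the digest package is installed to compute the hash of a file on disk.

```r
# Hypothetical helper illustrating the SHA-1 check; assumes the digest
# package is available (install.packages("digest")).
check_sha1 <- function(path, expected = NULL) {
  # Hash the file contents, not the file name
  actual <- digest::digest(path, algo = "sha1", file = TRUE)

  if (is.null(expected)) {
    # No hash supplied: report the hash so it can be reused later
    message("SHA-1 hash of file is ", actual)
  } else if (!identical(tolower(expected), actual)) {
    # Hash supplied but different: refuse to continue
    stop("SHA-1 hash of file (", actual,
         ") does not match expected value (", expected, ")")
  }
  invisible(actual)
}

# Usage with a local temporary file standing in for a download
tmp <- tempfile()
writeBin(charToRaw("abc"), tmp)
file_hash <- check_sha1(tmp)        # prints a message with the hash
check_sha1(tmp, sha1 <- file_hash)  # passes silently
```

Note that the hash is computed from the file's contents, which is why a Git commit SHA-1 cannot be used in its place.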
Examples
# Download electoral disproportionality data stored on GitHub
# Note: Using shortened URL created by bitly
DisData <- source_data("http://bit.ly/156oQ7a")
# Check to see if SHA-1 hash matches downloaded file
DisDataHash <- source_data("http://bit.ly/Ss6zDO",
sha1 = "dc8110d6dff32f682bd2f2fdbacb89e37b94f95d")