aws.s3 v0.3.12


'AWS S3' Client Package

A simple client package for the Amazon Web Services ('AWS') Simple Storage Service ('S3') 'REST' 'API' <https://aws.amazon.com/s3/>.

Readme

AWS S3 Client Package

aws.s3 is a simple client package for the Amazon Web Services (AWS) Simple Storage Service (S3) REST API. While other packages currently connect R to S3, they do so incompletely (mapping only some of the API endpoints to R) and most implementations rely on the AWS command-line tools, which users may not have installed on their system.

To use the package, you will need an AWS account and to enter your credentials into R. Your keypair can be generated on the IAM Management Console under the heading Access Keys. Note that you only have access to your secret key once: after it is generated, you need to save it in a secure location. New keypairs can be generated at any time if yours has been lost, stolen, or forgotten. The aws.iam package provides tools for working with IAM, including creating roles, users, groups, and credentials programmatically; it is not needed to use IAM credentials.

A detailed description of how credentials can be specified is provided at: https://github.com/cloudyr/aws.signature/. The easiest way is simply to set environment variables on the command line prior to starting R, or via an Renviron.site or .Renviron file, which are used to set environment variables in R during startup (see ?Startup). They can also be set within R:

Sys.setenv("AWS_ACCESS_KEY_ID" = "mykey",
           "AWS_SECRET_ACCESS_KEY" = "mysecretkey",
           "AWS_DEFAULT_REGION" = "us-east-1",
           "AWS_SESSION_TOKEN" = "mytoken")

To use the package with S3-compatible storage provided by other cloud platforms, set the AWS_S3_ENDPOINT environment variable to the appropriate host name. By default, the package uses the AWS endpoint: s3.amazonaws.com
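
For example, to point the package at a hypothetical S3-compatible service (the host name below is a placeholder; consult your provider's documentation for the host and region values it expects):

# placeholder host name for a non-AWS, S3-compatible provider
Sys.setenv("AWS_S3_ENDPOINT" = "objects.example.com")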

Code Examples

The package can be used to examine publicly accessible S3 buckets and objects without registering an AWS account. If credentials have been generated in the AWS console and made available in R, you can find your available buckets using:

library("aws.s3")
bucketlist()

If your credentials are incorrect, this function will return an error. Otherwise, it will return a list of information about the buckets you have access to.

Buckets

To get a listing of all objects in a public bucket, simply call

get_bucket(bucket = '1000genomes')

Amazon maintains a listing of Public Data Sets on S3.
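
Large public buckets can contain many objects; if needed, the listing can be narrowed with get_bucket()'s prefix and max arguments (the prefix below is illustrative):

# list at most 20 objects whose keys begin with "phase3/"
get_bucket(bucket = '1000genomes', prefix = 'phase3/', max = 20)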

To get a listing for all objects in a private bucket, pass your AWS key and secret in as parameters. (As described above, all functions in aws.s3 will look for your keys as environment variables by default, greatly simplifying the process of making an S3 request.)

# specify keys in-line
get_bucket(
  bucket = 'my_bucket',
  key = YOUR_AWS_ACCESS_KEY,
  secret = YOUR_AWS_SECRET_ACCESS_KEY
)

# specify keys as environment variables
Sys.setenv("AWS_ACCESS_KEY_ID" = "mykey",
           "AWS_SECRET_ACCESS_KEY" = "mysecretkey")
get_bucket("my_bucket")

S3 can be a bit picky about region specifications. bucketlist() will return buckets from all regions, but all other functions require specifying a region. A default of "us-east-1" is relied upon if none is specified explicitly and the correct region can't be detected automatically. (Note: using an incorrect region is one of the most common - and hardest to figure out - errors when working with S3.)
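
If automatic detection fails, the region can be supplied explicitly, either per call (it is passed through to the underlying s3HTTP() request) or once per session via the AWS_DEFAULT_REGION environment variable; the bucket name and region below are illustrative:

# specify the region for a single request
get_bucket("my_bucket", region = "eu-west-1")

# or set a session-wide default
Sys.setenv("AWS_DEFAULT_REGION" = "eu-west-1")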

Objects

There are eight main functions that will be useful for working with objects in S3:

  1. s3read_using() provides a generic interface for reading from S3 objects using a user-defined function
  2. s3write_using() provides a generic interface for writing to S3 objects using a user-defined function
  3. get_object() returns a raw vector representation of an S3 object. This might then be parsed in a number of ways, such as rawToChar(), xml2::read_xml(), jsonlite::fromJSON(), and so forth depending on the file format of the object
  4. save_object() saves an S3 object to a specified local file
  5. put_object() stores a local file into an S3 bucket
  6. s3save() saves one or more in-memory R objects to an .Rdata file in S3 (analogously to save()). s3saveRDS() is an analogue for saveRDS()
  7. s3load() loads one or more objects into memory from an .Rdata file stored in S3 (analogously to load()). s3readRDS() is an analogue for readRDS()
  8. s3source() sources an R script directly from S3

They behave as you would probably expect:

# save an in-memory R object into S3
s3save(mtcars, bucket = "my_bucket", object = "mtcars.Rdata")

# `load()` R objects from the file
s3load("mtcars.Rdata", bucket = "my_bucket")

# get file as raw vector
get_object("mtcars.Rdata", bucket = "my_bucket")
# alternative 'S3 URI' syntax:
get_object("s3://my_bucket/mtcars.Rdata")

# save file locally
save_object("mtcars.Rdata", file = "mtcars.Rdata", bucket = "my_bucket")

# put local file into S3
put_object(file = "mtcars.Rdata", object = "mtcars2.Rdata", bucket = "my_bucket")
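
The generic read/write helpers and the RDS analogues follow the same pattern; the sketch below assumes illustrative bucket and object names:

# write a data frame to S3 as CSV via a user-supplied function
s3write_using(mtcars, FUN = write.csv,
              object = "mtcars.csv", bucket = "my_bucket")

# read it back with the matching reader
df <- s3read_using(FUN = read.csv,
                   object = "mtcars.csv", bucket = "my_bucket")

# single-object serialization, analogous to saveRDS()/readRDS()
s3saveRDS(mtcars, bucket = "my_bucket", object = "mtcars.rds")
mtcars2 <- s3readRDS(object = "mtcars.rds", bucket = "my_bucket")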

Installation


The package is on CRAN; the latest stable version can also be installed from the cloudyr drat repository:

# latest stable version
install.packages("aws.s3", repos = c("cloudyr" = "http://cloudyr.github.io/drat"))

# on windows you may need:
install.packages("aws.s3", repos = c("cloudyr" = "http://cloudyr.github.io/drat"), INSTALL_opts = "--no-multiarch")

Or, to pull a potentially unstable version directly from GitHub:

if (!require("remotes")) {
    install.packages("remotes")
}
remotes::install_github("cloudyr/aws.s3")


Functions in aws.s3

Name                 Description
s3save               save/load
get_bucketname       Utility Functions
s3write_using        Custom read and write
get_bucket_policy    Bucket policies
get_requestpayment   requestPayment
put_bucket           Create bucket
s3HTTP               S3 HTTP Requests
get_versions         Bucket versions
s3source             Source from S3
get_replication      Bucket replication
get_lifecycle        Lifecycle
get_notification     Notifications
put_object           Put object
delete_website       Bucket Website configuration
bucket_exists        Bucket exists?
get_acceleration     Bucket Acceleration
delete_bucket        Delete Bucket
delete_object        Delete object
get_cors             CORS
get_acl              Get or put bucket/object ACLs
bucketlist           List Buckets
aws.s3-package       aws.s3-package
copy_object          Copy Objects
getobject            Deprecated
get_uploads          Multipart uploads
get_object           Get object
get_torrent          Get object torrent
get_location         Bucket location
get_encryption       Bucket encryption
get_bucket           List bucket contents
get_tagging          Bucket tagging
s3sync               S3 file sync
s3saveRDS            saveRDS/readRDS


Details

Type Package
Date 2018-05-25
License GPL (>= 2)
URL https://github.com/cloudyr/aws.s3
BugReports https://github.com/cloudyr/aws.s3/issues
RoxygenNote 6.0.1
NeedsCompilation no
Packaged 2018-05-25 08:29:49 UTC; THOMAS
Repository CRAN
Date/Publication 2018-05-25 09:33:38 UTC
