
flyio - Make data fly to R

Input and output data from R: download, upload, read and write objects from AWS S3, Google Cloud Storage or the local file system through a single interface.

Overview

flyio provides a common interface for working with data in cloud storage or local storage directly from R. It currently supports AWS S3 and Google Cloud Storage, thanks to the API wrappers provided by cloudyr. flyio can also read tables, rasters, shapefiles and R objects from the data source into memory, and write them from memory back to the data source.

  • flyio_set_datasource(): Set the data source (GCS, S3 or local) for all the other functions in flyio.
  • flyio_auth(): Authenticate data source (GCS or S3) so that you have access to the data. In a single session, different data sources can be authenticated.
  • flyio_set_bucket(): Set the bucket name once for either or both data sources so that you don't need to pass it to each function.
  • list_files(): List the files in the bucket/folder.
  • file_exists(): Check if a file exists in the bucket/folder.
  • export_[file/folder](): Upload a file/folder to S3 or GCS from R.
  • import_file(): Download a file from S3 or GCS.
  • import_[table/raster/stack/shp/rds/rda/st](): Read a file from the set data source and bucket using a user-defined function.
  • export_[table/raster/shp/rds/rda/st](): Write a file to the set data source and bucket using a user-defined function.
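The import_* and export_* helpers take the reader or writer as an argument, which is what "user-defined function" means here. The general idea can be sketched in base R as below. The helper name read_with is hypothetical and this is not flyio's actual implementation; a local file copy stands in for the download from S3 or GCS.

```r
# Hypothetical helper (not part of flyio): stage a file in a temporary
# location, then hand it to whatever reader function the caller supplies.
read_with <- function(path, FUN = utils::read.csv, ...) {
  staged <- tempfile(fileext = paste0(".", tools::file_ext(path)))
  # A local copy stands in for the download from cloud storage.
  ok <- file.copy(path, staged)
  if (!ok) stop("could not stage file: ", path)
  on.exit(unlink(staged), add = TRUE)  # remove the staged copy afterwards
  FUN(staged, ...)
}

# Usage: write a small table, then read it back with two different readers.
src <- tempfile(fileext = ".csv")
utils::write.csv(head(mtcars), src, row.names = FALSE)

df_default <- read_with(src)                   # parsed with utils::read.csv
df_lines   <- read_with(src, FUN = readLines)  # any function taking a path works
```

Passing the reader as an argument keeps the transfer step independent of the file format, so the same call shape works for CSV, RDS or spatial files.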

For global usage, the data source, authentication keys and bucket can be set in the environment variables of the machine so that they do not have to be entered every time.

  • For data source: CLOUD_STORAGE_NAME
  • For bucket name: flyioBucketS3 or flyioBucketGcs
  • For authentication: GCS_AUTH_FILE or AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION (for AWS S3, this step is not needed if the awscli is already authenticated)
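As a minimal sketch, the variables listed above can also be set for the current session with base R's Sys.setenv(). The variable names come from the list; the values ("gcs" and the bucket name "my-flyio-bucket") are illustrative assumptions. To persist across sessions, the same name=value pairs would typically go in a startup file such as ~/.Renviron.

```r
# Set the flyio-related environment variables for the current R session.
# Variable names are from the list above; the values are placeholders.
Sys.setenv(
  CLOUD_STORAGE_NAME = "gcs",             # assumed value, mirroring flyio_set_datasource("gcs")
  flyioBucketGcs     = "my-flyio-bucket"  # hypothetical bucket name
)

# Confirm what the session now sees.
Sys.getenv(c("CLOUD_STORAGE_NAME", "flyioBucketGcs"))
```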

Installation

# Install the stable version from CRAN:
install.packages("flyio")

# Install the latest dev version from GitHub:
install.packages("devtools")
devtools::install_github("atlanhq/flyio")

# Load the library
library(flyio)

If you encounter a bug, please file an issue on GitHub with steps to reproduce it. Please use the same channel for feature requests, enhancements or suggestions.

Example

# Setting the data source
flyio_set_datasource("gcs")

# Verify if the data source is set
flyio_get_datasource()

# Authenticate the default data source and set bucket
flyio_auth("key.json")
flyio_set_bucket("atlanhq-flyio")

# Authenticate S3 as well (the strings below are placeholders for the actual credential values)
flyio_auth(c("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION", "AWS_SESSION_TOKEN"), data_source = "s3")
flyio_set_bucket("atlanhq-flyio", data_source = "s3")

# Listing the files in GCS
list_files(path = "test", pattern = "*csv")

# Saving mtcars to all the data sources using the default function write.csv
export_table(mtcars, "~/Downloads/mtcars.csv", data_source = "local")
export_table(mtcars, "test/mtcars.csv") # saves to GCS; data_source can be omitted because it was set globally
export_table(mtcars, "test/mtcars.csv", data_source = "s3")

# Check if the file written exists in GCS
file_exists("test/mtcars.csv")

# Read the file from GCS using readr library
mtcars <- import_table("test/mtcars.csv", FUN = readr::read_csv)

Version: 0.1.4
License: GPL-3
Maintainer: Himanshu Sikaria
Last Published: October 31st, 2019
Monthly Downloads: 4

Functions in flyio (0.1.4)

  • import_file: Download file from cloud to local system
  • flyio_get_datasource: Get global data source name for flyio
  • import_stack: Read stack from GCS/S3 or local
  • flyio_get_bucket: Get global bucket name for flyio
  • import_table: Read csv, Excel files, txt
  • import_shp: Read shapefiles
  • export_rda: Write RDA files
  • flyio_get_dir: Get global directory for flyio to store data
  • import_st: Read geojson, geopkg
  • flyio_list_dir: List files in flyio tmp folder
  • flyio_set_datasource: Set global data source name for flyio
  • import_raster: Read raster files
  • flyio_remove_dir: Delete files in flyio tmp folder
  • flyio_set_bucket: Set global bucket name for flyio
  • list_bucket: List buckets for cloud storage
  • list_files: List the files in a directory/folder
  • import_rda: Read RDA file
  • import_rds: Read RDS file
  • flyio_set_dir: Set global directory for flyio to store data
  • export_folder: Upload a folder from the local system to cloud
  • export_rds: Write RDS files
  • flyio_auth: Authenticate flyio
  • export_st: Write geojson and geopkgs
  • export_raster: Write raster
  • export_file: Upload a file from the local system to cloud
  • export_shp: Write shapefiles
  • export_table: Write csv, Excel files, txt
  • file_exists: Check if a file exists