
getSpatialData

Introduction

getSpatialData is an R package in an early development stage. It ultimately aims to provide homogeneous function bundles to query, download, prepare and transform various kinds of spatial datasets from open sources, e.g. satellite sensor data or higher-level environmental data products. It supports both sf and sp classes as AOI inputs (see set_aoi in Available Functions). Because the package is at an early development stage, the included functions and their concepts may still be changed or removed.

Documentation is available for all public functions. See also the list of data sources that are or will be implemented.

Installation

To install the current beta version, use devtools.

devtools::install_github("16EAGLE/getSpatialData")

Available Functions

The following functions are publicly available and tested on Linux (Ubuntu 16.04 LTS, 17.10, 18.04 LTS) and Windows 10.

Sentinel

  • getSentinel_query() – queries the Copernicus Open Access Hubs for Sentinel-1, Sentinel-2, Sentinel-3, Sentinel-5 Precursor and Sentinel GNSS data and returns a data frame containing the found records (rows) and their attributes (columns).
  • getSentinel_restore() – requests the restoration of Sentinel datasets that ESA has moved to the Copernicus Long-Term Archive (LTA) (see argument check_avail of getSentinel_query()).
  • getSentinel_preview() – uses the output of getSentinel_query() to preview (quick-look) a user-selected record even before downloading it. By default, the preview is displayed corner-georeferenced in a map viewer in relation to the session AOI.
  • getSentinel_data() – uses the output of getSentinel_query() to download Sentinel data.

Landsat

  • getLandsat_names() – obtains available Landsat product names from USGS Earth Explorer, which can be optionally used with getLandsat_query() to narrow the search.
  • getLandsat_query() – queries USGS Earth Explorer for Landsat data and returns a data frame containing the found records (rows) and their attributes (columns).
  • getLandsat_preview() – uses the output of getLandsat_query() to preview (quick-look) a user-selected record. By default, the preview is displayed corner-georeferenced in a map viewer in relation to the session AOI.
  • getLandsat_data() – uses the output of getLandsat_query() to order and download Landsat data.
    • supports order (on-demand processing) and download of higher-level products (all Landsat products), e.g. top-of-atmosphere (TOA), surface reflectance (SR) or different indices, from USGS-EROS ESPA.
    • supports direct download of Level-1 products (Landsat-8 only) via Amazon Web Services (AWS).
    • will support direct download of Level-1 products (all Landsat products) via USGS EarthExplorer (requires a USGS user profile with machine-to-machine download permission).

MODIS

  • getMODIS_names() – obtains available MODIS product names from USGS Earth Explorer, which can be optionally used with getMODIS_query() to narrow the search.
  • getMODIS_query() – queries USGS Earth Explorer for MODIS data and returns a data frame containing the found records (rows) and their attributes (columns).
  • getMODIS_preview() – uses the output of getMODIS_query() to preview (quick-look) a user-selected record. By default, the preview is displayed corner-georeferenced in a map viewer in relation to the session AOI.
  • getMODIS_data() – uses the output of getMODIS_query() to order and download MODIS data from LAADS.

Preprocessing

  • prepSentinel() beta – makes downloaded Sentinel datasets ready-to-use by automatically inspecting, extracting, sorting and converting the relevant contents of the datasets to a user-defined format.
  • cropFAST() beta – crops a raster file to a spatial extent using GDAL. It is useful when working with large-scale, memory-intensive datasets.

Session Login

  • login_CopHub() – define your Copernicus Open Access login credentials once for the present R session to be able to call each getSentinel* function without defining login arguments each time you use them.
  • login_USGS() – define your USGS login credentials once for the present R session to be able to call each get* function that connects to a USGS service without defining login arguments each time you use them.

Session Settings

  • set_archive() – define a getSpatialData archive directory to which all *_data functions will download data.
  • set_aoi() – draw or define an AOI as sf, sp or matrix object for the running session that can be used by all query functions.
  • view_aoi() – display the session AOI in an interactive mapview/leaflet map viewer.
  • get_aoi() – get the session AOI you have defined or drawn before as sf, sp or matrix object.
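
As a minimal sketch of the matrix option, the following base R snippet builds a rectangular AOI and checks it before use. It assumes that a matrix AOI is a two-column table of corner coordinates (longitude, latitude) with at least three rows; the coordinates and column names are purely illustrative, so check ?set_aoi for the exact requirements.

```r
# Corner coordinates of a hypothetical rectangular AOI (longitude, latitude)
aoi_matrix <- rbind(
  c(10.0, 49.5),
  c(10.5, 49.5),
  c(10.5, 50.0),
  c(10.0, 50.0)
)
colnames(aoi_matrix) <- c("lon", "lat")  # illustrative names (assumption)

# Basic sanity checks before handing the matrix to set_aoi()
stopifnot(
  is.numeric(aoi_matrix),
  ncol(aoi_matrix) == 2,
  nrow(aoi_matrix) >= 3,
  all(abs(aoi_matrix[, "lon"]) <= 180),
  all(abs(aoi_matrix[, "lat"]) <= 90)
)

# With getSpatialData loaded, the session AOI could then be set with:
# set_aoi(aoi_matrix)
```

The checks only catch obviously malformed input, such as swapped or out-of-range coordinates; they do not verify that the polygon is valid.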

Semantics

The following universal semantics on data are used by getSpatialData (from smallest to biggest entity):

  • image: An image of a specific time and spatial extent.
  • record: A set of metadata fields identifying and describing a specific image. A query returns multiple records.
  • dataset: Smallest entity that is delivered by a service. Might consist of multiple files, including metadata and bandwise imagery. Covers a specific time and spatial extent.
  • product: A data product offered by a specific service, consisting of multiple datasets over a period of time and a wide spatial extent. Might be differentiated by:
    • platform: A general platform design (e.g. "Landsat" or "Sentinel").
    • sensor: Type of sensor which acquired the data from which the product originates (e.g. "MODIS", "MSI" or "OLI").
    • collection: A product version.
    • level: Processing level of the product (e.g. "Level 2A" or "Surface Reflectance").
    • source: The service acquiring, processing or distributing the product (e.g. "ESA Copernicus" or "USGS").

The following universal semantics on computational steps are used by getSpatialData:

  • get: Receive data from different sources, named either by sensor or platform (whichever is used by the scientific community to refer to the derived products).
    • names: Result of searching available products (differs by source and platform), which might be differentiated further later on (e.g. by level).
    • query: Result of searching a source for data records of a specific or multiple products.
    • preview: Preview a record.
    • data: Result of receiving one or multiple datasets from a source.
  • prep: Prepare/preprocess data obtained with get.
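
The naming scheme behind these semantics can be illustrated with a short base R sketch. It only composes function names that are listed under Available Functions (the query, preview and data steps for Sentinel, Landsat and MODIS); the variable names are illustrative, and steps that exist for some platforms only, such as names or restore, are left out.

```r
# Platforms/sensors and computational steps taken from the function list
platforms <- c("Sentinel", "Landsat", "MODIS")
steps <- c("query", "preview", "data")

# Compose names following the pattern get<Platform>_<step>
fn_names <- outer(paste0("get", platforms), steps, paste, sep = "_")
dimnames(fn_names) <- list(platforms, steps)

print(fn_names)
```

Each row of the resulting matrix is one platform and each column is one step, so fn_names["MODIS", "data"] gives the name of the MODIS download function.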

Get Started

Sentinel query, preview and download

The following code represents a working chain for querying, filtering, previewing and downloading Sentinel-2 data within R. The same procedure works for Sentinel-1, Sentinel-3 or Sentinel-5P. The workflow for dealing with MODIS and Landsat data using getSpatialData is very similar. See the function reference for detailed examples.

## Load packages
library(getSpatialData)
library(raster)
library(sf)
library(sp)

## Define an AOI (either matrix, sf or sp object)
data("aoi_data") # example aoi

aoi <- aoi_data[[3]] # AOI as matrix object, or better:
aoi <- aoi_data[[2]] # AOI as sp object, or:
aoi <- aoi_data[[1]] # AOI as sf object
# Instead, you could define an AOI yourself, e.g. as a simple matrix

## Set AOI for this session
set_aoi(aoi)
view_aoi() # view the AOI in the interactive map viewer

# Instead of using an existing AOI, you can simply draw one.
# Calling set_aoi() without an argument opens a mapedit editor:
# set_aoi()

## Set login credentials and archive directory
login_CopHub(username = "your_username") # asks you for your password
set_archive("/path/to/archive/")

## Use getSentinel_query to search for data (using the session AOI)
records <- getSentinel_query(time_range = c("2017-08-01", "2017-08-30"), 
                             platform = "Sentinel-2") #or "Sentinel-1" or "Sentinel-3"

## Filter the records
colnames(records) #see all available filter attributes
unique(records$processinglevel) #use one of them, e.g. to see available processing levels

records_filtered <- records[which(records$processinglevel == "Level-1C"),] #filter by Level
records_filtered <- records_filtered[as.numeric(records_filtered$cloudcoverpercentage) <= 30, ] #filter by clouds

## Browse the records table or your filtered records
View(records)
View(records_filtered)

## Preview a single record on a mapview map with session AOI
getSentinel_preview(record = records_filtered[9,])

## Preview a single record on a mapview map without session AOI
getSentinel_preview(record = records_filtered[9,], show_aoi = FALSE)

## Preview a single record as RGB plot
getSentinel_preview(record = records_filtered[9,], on_map = FALSE)

## Download some datasets to your archive directory
datasets <- getSentinel_data(records = records_filtered[c(4,7,9), ])

## Finally, define an output format and make them ready-to-use
datasets_prep <- prepSentinel(datasets, format = "tiff")
# or use VRT to not store duplicates of different formats
datasets_prep <- prepSentinel(datasets, format = "vrt")

## View the files
datasets_prep[[1]][[1]][1] #first dataset, first tile, 10 m resolution
datasets_prep[[1]][[1]][2] #first dataset, first tile, 20 m resolution
datasets_prep[[1]][[1]][3] #first dataset, first tile, 60 m resolution

## Load them directly into R
r <- stack(datasets_prep[[1]][[1]][1])
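
The filtering step in the chain above can be tried without a Copernicus login. The following sketch uses a small mock data frame instead of a real query result. The column names processinglevel and cloudcoverpercentage are taken from the example above; the title column, the values, and the assumption that cloud cover is stored as text are illustrative only.

```r
# Mock query result; real records contain many more columns
records <- data.frame(
  title = c("S2A_1", "S2A_2", "S2B_3", "S2B_4"),
  processinglevel = c("Level-1C", "Level-2A", "Level-1C", "Level-1C"),
  cloudcoverpercentage = c("12.5", "3.0", "48.1", "29.9"),
  stringsAsFactors = FALSE
)

# Convert the cloud cover column once, then combine both conditions
cloud <- as.numeric(records$cloudcoverpercentage)
keep <- records$processinglevel == "Level-1C" & !is.na(cloud) & cloud <= 30
records_filtered <- records[keep, ]

print(records_filtered$title)
```

Building one logical vector and excluding NA values explicitly avoids rows of NA that plain logical indexing would otherwise introduce when an attribute is missing.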

Parallelized download of MODIS data (user example)

The following example shows how to query and then download MODIS imagery in parallel. This increases the overall download speed if enough bandwidth is available to the client. The example has been contributed by Carina Kuebert.

## Load packages for working on multi-core
library(parallel)
library(doParallel)
library(foreach)

## getSpatialData
library(getSpatialData)


#### specify which files to download ####
# specify outdir (where files will be downloaded to)
outdir <- "/path/to/download/directory/"

# load example aoi
data("aoi_data")
set_aoi(aoi_data[[1]]) 
view_aoi()

# check, if service is available
services_avail()

## USGS login
login_USGS(username = "your_username")

# get available products 
product_names <- getMODIS_names()

# query for records for your AOI, time range and product
time_range <-  c("2019-01-01", "2019-01-10")
records <- getMODIS_query(time_range = time_range, name = grep("MOD09GA", product_names, value = TRUE))


#### initiate cluster for parallel download ####
# keep at least one worker, also on single-core machines
no_cores <- max(1, detectCores() - 1, na.rm = TRUE)
cl <- makeCluster(no_cores, type = "PSOCK")
registerDoParallel(cl)

# process one record per iteration and combine the results into one vector
files <- foreach(i = seq_len(nrow(records)), 
                 .combine = c, 
                 .packages = "getSpatialData") %dopar% {
                   getMODIS_data(records[i, ], dir_out = outdir)
                 }


#### stop cluster ####
stopCluster(cl)
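
The same cluster pattern can be sketched with the parallel package alone, which makes it easy to test without a USGS login. In this sketch, fetch_one() is a hypothetical stand-in for getMODIS_data() that only builds a file path, and the record identifiers are made up. It also shows one way to make sure the cluster is stopped even if a worker fails.

```r
library(parallel)

# Hypothetical stand-in for getMODIS_data(): returns a path, downloads nothing
fetch_one <- function(id, dir_out) {
  file.path(dir_out, paste0(id, ".hdf"))
}

record_ids <- c("rec_a", "rec_b", "rec_c")  # made-up identifiers

# Use at most one worker per record and never fewer than one worker
n_cores <- detectCores()
if (is.na(n_cores)) n_cores <- 1L
n_workers <- max(1L, min(length(record_ids), n_cores - 1L))

cl <- makeCluster(n_workers, type = "PSOCK")
files <- tryCatch(
  unlist(parLapply(cl, record_ids, fetch_one, dir_out = tempdir())),
  finally = stopCluster(cl)  # always release the workers
)

print(basename(files))
```

Wrapping the parallel call in tryCatch() with a finally clause means that no worker processes are left behind when a download raises an error.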

Products

The following products are being evaluated for implementation within the package. This also includes sources which can already be accessed through existing packages that could be wrapped behind a standardized R function interface. Please feel free to contribute to the list, e.g. through a pull request:

| Product(s) | Source | Access | Status | Client(s) |
| --- | --- | --- | --- | --- |
| Sentinel (-1/-2/-3, -5P, GNSS) | ESA Copernicus | Copernicus Open Access Hub API | implemented | native |
| MODIS | NASA/USGS | ORNL DAAC SOAP MODIS web service, LAADS DAAC SOAP/REST web service | implemented | native |
| Landsat | USGS | USGS EarthExplorer json API, USGS-EROS ESPA, AWS | implemented | native |
| Global Forest Change | Hansen et al. | http://azvoleff.com/articles/analyzing-forest-change-with-gfcanalysis | evaluated | R: gfcanalysis? |
| CMIP5/PMIP3 Global Climate | ecoClimate | http://ecoclimate.org/about/ | evaluated | R: ecoClimate? |
| Copernicus Global Land Products | ESA Copernicus | http://land.copernicus.eu/ | evaluated | |
| CHELSA Global Land Climate | Karger et al. | http://chelsa-climate.org/ | evaluated | |
| Global Forest Cover | EU-JRC | http://remote-sensing-biodiversity.org/forest-cover-and-forest-cover-pattern-data-by-jrc/ | evaluated | |
| Global Surface Dynamics | EU-JRC | http://remote-sensing-biodiversity.org/global-water-dynamics-data/ | evaluated | |
| Global Soil Grids | Hengl et al. | http://remote-sensing-biodiversity.org/global-soil-data-soilgrids/ | evaluated | |
| Global Urban Footprint | Esch et al. | https://urban-tep.eo.esa.int/geobrowser/?id=portfolio#!&context=GUF%2FGUF2012-12m | evaluated | |
| UK Urban Areas LiDAR | UK Environment Agency | http://remote-sensing-biodiversity.org/free-lidar-data-for-some-uk-cities/ | evaluated | |
| Global Human Built-up And Settlement Extent (HBASE) | Wang et al. | http://sedac.ciesin.columbia.edu/data/set/ulandsat-hbase-v1 | evaluated | |
| GIMMS NDVI3g | NASA | https://nex.nasa.gov/nex/projects/1349/ | evaluated | R: GIMMS? |

Contribution

Contribute! I'm happy about any kind of contribution, from feature ideas, ideas on possible data sources or other technical ideas to bug fixes, code suggestions or larger code contributions. Open an issue to start a discussion: https://github.com/16eagle/getSpatialData/issues

Mentioned

getSpatialData has been mentioned here:

Kwok, R., 2018. Ecology’s remote-sensing revolution. Nature 556, 137. https://doi.org/10.1038/d41586-018-03924-9

Version: 0.0.4
License: GPL-3 | file LICENSE
Last published: June 23rd, 2018