geeLite R Package

This package streamlines the process of building, managing, and updating local 'SQLite' databases that contain geospatial features extracted from 'Google Earth Engine' ('GEE').

Installation

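# Install the development version from GitHub
# install.packages("devtools")
devtools::install_github("mtkurbucz/geeLite")

# Install and configure the Conda environment for 'rgee'
geeLite::gee_install()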

Usage

  1. Loading the package:
library(geeLite)
  2. Setting the configuration file:
path <- "path/to/db"

set_config(path = path,
           regions = c("SO", "YE"),
           source = list(
              "MODIS/061/MOD13A2" = list(
                "NDVI" = c("mean", "sd")
              )
           ),
           resol = 3,
           start = "2020-01-01")
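
The 'source' argument is a nested list: dataset ID first, then band, then the statistics requested for that band. The sketch below uses base R only and does not call geeLite; it flattens the list from the example above into one row per dataset/band/statistic so the request is easy to review before the configuration is written. The helper name 'flatten_source' and the variable 'source_spec' are hypothetical.

```r
# Same nested list as the 'source' argument in the set_config() call above.
source_spec <- list(
  "MODIS/061/MOD13A2" = list(
    "NDVI" = c("mean", "sd")
  )
)

# Hypothetical helper (not part of geeLite): returns one row per
# dataset / band / statistic combination.
flatten_source <- function(spec) {
  rows <- list()
  for (dataset in names(spec)) {
    for (band in names(spec[[dataset]])) {
      rows[[length(rows) + 1]] <- data.frame(
        dataset = dataset,
        band = band,
        stat = spec[[dataset]][[band]],
        stringsAsFactors = FALSE
      )
    }
  }
  do.call(rbind, rows)
}

requested <- flatten_source(source_spec)
print(requested)
```

Adding a second dataset or band to 'source_spec' simply adds more rows, which makes it easier to spot a misspelled band or statistic before any data is requested.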
  3. Collecting GEE data based on the configuration file:
run_geelite(path = path)
#> 
#> ────────────────────────────────────────────────────────────────────────────────
#> geeLite R Package
#> ────────────────────────────────────────────────────────────────────────────────
#> 
#> ── rgee 1.1.7 ─────────────────────────────────────── earthengine-api 0.1.370 ── 
#>  ✔ User: not defined 
#>  ✔ Initializing Google Earth Engine:  DONE!
#>  ✔ Earth Engine account: user
#>  ✔ Python path: C:/.../AppData/Local/r-miniconda/envs/rgee/python.exe 
#> ────────────────────────────────────────────────────────────────────────────────
#>
#> ℹ Database built successfully.
  4. Modifying the configuration file:
modify_config(path = path,
              keys = list(
                c("source", "MODIS/061/MOD13A2", "NDVI"),
                c("source", "MODIS/061/MOD13A2", "EVI")
              ),
              new_values = list(
                c("mean", "min", "max"),
                c("mean", "sd")
              ))
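
Each entry in 'keys' is a path into the nested configuration, and the entry at the same position in 'new_values' is the value stored at that path. The base R sketch below mimics that pairing on a plain list to show the expected result. It is an illustration under that reading of the call above, not geeLite's implementation, and the helper 'set_in' is hypothetical.

```r
# A plain list standing in for the 'source' part of the configuration.
config <- list(
  source = list(
    "MODIS/061/MOD13A2" = list(NDVI = c("mean", "sd"))
  )
)

# Hypothetical helper: set 'value' at the nested path given by 'key',
# creating intermediate lists where they do not exist yet.
set_in <- function(x, key, value) {
  if (length(key) == 1) {
    x[[key]] <- value
    return(x)
  }
  child <- x[[key[1]]]
  if (is.null(child)) child <- list()
  x[[key[1]]] <- set_in(child, key[-1], value)
  x
}

keys <- list(
  c("source", "MODIS/061/MOD13A2", "NDVI"),
  c("source", "MODIS/061/MOD13A2", "EVI")
)
new_values <- list(
  c("mean", "min", "max"),
  c("mean", "sd")
)

# Apply the updates pairwise: keys[[i]] receives new_values[[i]].
for (i in seq_along(keys)) {
  config <- set_in(config, keys[[i]], new_values[[i]])
}

str(config$source)
```

In this sketch the existing NDVI entry is overwritten and a new EVI entry is added next to it.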
  5. Updating the database based on the configuration file:
run_geelite(path = path)
#> 
#> ────────────────────────────────────────────────────────────────────────────────
#> geeLite R Package
#> ────────────────────────────────────────────────────────────────────────────────
#> 
#> ── rgee 1.1.7 ─────────────────────────────────────── earthengine-api 0.1.370 ── 
#>  ✔ User: not defined 
#>  ✔ Initializing Google Earth Engine:  DONE!
#>  ✔ Earth Engine account: user
#>  ✔ Python path: C:/.../AppData/Local/r-miniconda/envs/rgee/python.exe
#> ────────────────────────────────────────────────────────────────────────────────
#>
#> ℹ Database updated successfully.
  6. Reading the generated database:
# Read the SQLite database. In this call, read_db() will:
# 1) Convert data to daily format and apply the default linear
#    interpolation ('prep_fun').
# 2) Aggregate data to the default monthly frequency ('freq') using the
#    functions in 'aggr_funs' (here: mean and standard deviation).

db <- read_db(
  path = path,
  aggr_funs = list(
    function(x) mean(x, na.rm = TRUE),
    function(x) sd(x, na.rm = TRUE)
  )
)
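
The two preparation steps named in the comments above can be illustrated without a database. The sketch below uses base R on made-up NDVI values spaced 16 days apart: it expands them to daily values by linear interpolation, then aggregates by month with mean and standard deviation. It approximates the idea only; read_db() may differ in detail, and all data and variable names here are invented for illustration.

```r
# Made-up observations every 16 days (illustrative values only).
obs_dates <- as.Date("2020-01-01") + seq(0, 80, by = 16)
obs_ndvi <- c(0.30, 0.34, 0.40, 0.38, 0.45, 0.50)

# Step 1: expand to daily frequency with linear interpolation.
daily_dates <- seq(min(obs_dates), max(obs_dates), by = "day")
daily_ndvi <- approx(
  x = as.numeric(obs_dates),
  y = obs_ndvi,
  xout = as.numeric(daily_dates)
)$y

# Step 2: aggregate the daily series by calendar month.
month <- format(daily_dates, "%Y-%m")
monthly <- data.frame(
  month = sort(unique(month)),
  mean = as.numeric(tapply(daily_ndvi, month, mean, na.rm = TRUE)),
  sd = as.numeric(tapply(daily_ndvi, month, sd, na.rm = TRUE))
)

print(monthly)
```

Swapping 'mean' and 'sd' for other summary functions in the second step corresponds to passing a different list to 'aggr_funs'.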

Drive Mode

To efficiently handle large data requests, 'geeLite' provides a 'drive' mode. In this mode, data are exported from 'Google Earth Engine' to 'Google Drive' in parallel batches and then imported into your local 'SQLite' database. Ensure sufficient available storage on your linked Google Drive account before using this mode.

# Collect and store data using drive mode
run_geelite(path = path, mode = "drive")
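
Batching here means splitting the set of grid cells into groups that are exported as separate tasks. The base R sketch below shows that idea on plain cell indices. The batch size and the helper 'make_batches' are hypothetical, and geeLite's own batching logic and defaults may differ.

```r
# Hypothetical helper: split a vector of IDs into consecutive batches
# of at most 'batch_size' elements.
make_batches <- function(ids, batch_size) {
  stopifnot(batch_size >= 1)
  split(ids, ceiling(seq_along(ids) / batch_size))
}

cell_ids <- 1:10                       # stand-in for grid cell identifiers
batches <- make_batches(cell_ids, batch_size = 4)

# Number of cells in each batch.
print(lengths(batches))
```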

Command-Line Interface (CLI) Usage

The geeLite package includes a command-line interface (CLI) for advanced users and automation workflows. All major operations can be performed directly from the terminal using Rscript.

The following example demonstrates how to configure and manage a database using the CLI:

# Set the CLI files
Rscript /path/to/geeLite/cli/set_cli.R --path "path/to/db"

# Navigate to the directory where the database will be generated
cd "path/to/db"

# Create a configuration file
Rscript cli/set_config.R --regions "SO YE" --source "list('MODIS/061/MOD13A2' = list('NDVI' = c('mean', 'min')))" --resol 3 --start "2020-01-01"

# Run the data collection based on the configuration
Rscript cli/run_geelite.R

# Modify the configuration file
Rscript cli/modify_config.R --keys "list(c('source', 'MODIS/061/MOD13A2', 'NDVI'), c('source', 'MODIS/061/MOD13A2', 'EVI'))" --new_values "list(c('mean', 'min', 'max'), c('mean', 'sd'))"

# Update the database with the modified configuration
Rscript cli/run_geelite.R
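
In the commands above, the values of --source, --keys, and --new_values are written in R list syntax inside a quoted string. Assuming the CLI scripts evaluate these strings as R expressions (an assumption, not confirmed here), a quick way to catch quoting mistakes is to parse the string in an R session before using it on the command line. The variable names below are illustrative.

```r
# The exact string passed to --source in the example above.
source_arg <- "list('MODIS/061/MOD13A2' = list('NDVI' = c('mean', 'min')))"

# Parse and evaluate it; a syntax error here means the string is malformed.
parsed <- eval(parse(text = source_arg))

# Basic sanity checks on the result: a named list of named lists.
stopifnot(is.list(parsed), !is.null(names(parsed)))
stopifnot(all(vapply(parsed, is.list, logical(1))))

str(parsed)
```

Only evaluate strings you wrote yourself this way, since eval() runs whatever code the string contains.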

Citation

If you use geeLite in your research, please cite:

Further Documentation

Additional documentation and usage examples are available at: https://github.com/mtkurbucz/geeLite/tree/main/docs

Data Availability Statement

All geospatial datasets are retrieved from the Google Earth Engine public data catalog. Users must have a registered Google account with GEE access. No proprietary or restricted data are used in this package.

Acknowledgments

Funding by the World Bank’s Food Systems 2030 (FS2030) Multi-Donor Trust Fund program (TF0C0728 and TF0C7822) is gratefully acknowledged. We thank Andres Chamorro and Ben P. Stewart for code testing and comments, as well as Steve Penson, David Newhouse and Alia J. Aghjanian for helpful comments and input. This paper reflects the views of the authors and does not reflect the official views of the World Bank, its Executive Directors, or the countries they represent.

Install

install.packages('geeLite')

Version

1.0.2

License

MPL-2.0

Maintainer

Marcell T. Kurbucz

Last Published

July 21st, 2025

Functions in geeLite (1.0.2)

extract_drive_stats: Extract Large-Scale Statistics in Drive Mode with Fewer Tasks
get_cases: Determine the Cases of Data Collection Requests
get_bins: Get H3 Bins for Shapes
fetch_country_regions: Fetch ISO 3166-1 Country Codes
fetch_vars: Fetch Variable Information from an SQLite Database
get_json: Print JSON File
modify_config: Modify Configuration File
gee_install: Install and Configure a Conda Environment for 'rgee'
output_message: Output Message
get_images: Retrieve Images and Related Information
init_postp: Initialize Post-Processing Folder and Files
linear_interp: Simple Linear Interpolation
remove_tables: Remove Tables from the Database
read_variables: Read Variables from Database
gee_message: Print Google Earth Engine and Python Environment Information
run_geelite: Build and Update the Grid Statistics Database
set_cli: Initialize CLI Files
write_state_file: Write State File
get_state: Print the State File
get_config: Print the Configuration File
get_grid: Obtain H3 Hexagonal Grid
set_dirs: Generate Necessary Directories
source_with_notification: Source an R Script with Notifications About Functions Loaded
process_single_file: Process a Single Source File
gen_messages: Define Output Messages
validate_source_param: Validate Source Parameter
load_external_postp: Load External Post-Processing Functions
set_progress_bar: Set Progress Bar
fetch_regions: Fetch ISO 3166 Country and Subdivision Codes
fetch_state_regions: Fetch ISO 3166-2 Subdivision Codes
write_grid: Write Grid to Database
validate_variables_param: Validate and Process Parameters for Variable Selection and Data Processing
update_grid_stats: Update Grid Statistics
process_source_files: Process Source Files
process_vector: Process Marked Vector
get_task: Generate Session Task
print_version: Display geeLite Package Version
set_config: Initialize the Configuration File
get_shapes: Get Shapes for Specified Regions
get_reducers: Get Reducers
set_depend: Set Dependencies
validate_params: Validate Parameters
read_db: Reading, Aggregating, and Processing the SQLite Database
write_grid_stats: Write Grid Statistics to Database
write_log_file: Write Log File
read_grid: Read Grid from Database
local_chunk_extract: Extract Statistics Locally for a Single Geometry Chunk
compare_lists: Compare Lists and Highlight Differences
clean_drive_folders_by_name: Clean Contents or Entire Google Drive Folders by Name
get_batches: Produce Batches for Build/Update Mixed Cases
check_rgee_ready: Check Google Earth Engine Connection
compare_vectors: Compare Vectors and Highlight Differences
get_batch: Create Batches from an sf Object
expand_to_daily: Expand Data to Daily Frequency
db_connect: Create or Open the Database Connection
batch_drive_export: Perform a Single Drive Export for Multiple Geometry Chunks
compile_db: Collect and Process Grid Statistics
dummy_use_for_cran: Internal Dummy Function for Declared Imports
aggr_by_freq: Aggregate Data by Frequency