srdrn: Super Resolution CNN for Spatial Downscaling

Description

This function implements a time-aware Super Resolution Deep Residual Network (SRDRN) for spatial downscaling of grid-based data. An optional temporal module can be added for spatio-temporal applications.

Usage

srdrn(
  coarse_data,
  fine_data,
  time_points = NULL,
  val_coarse_data = NULL,
  val_fine_data = NULL,
  val_time_points = NULL,
  cyclical_period = NULL,
  temporal_basis = c(9, 17, 37),
  temporal_layers = c(32, 64, 128),
  temporal_cnn_filters = c(8, 16),
  temporal_cnn_kernel_sizes = list(c(3, 3), c(3, 3)),
  activation = "relu",
  cos_sin_time = FALSE,
  use_batch_norm = FALSE,
  output_channels = 1,
  num_residual_blocks = 3,
  num_res_block_filters = 64,
  upscaling_filters = c(64, 32, 16, 8, 4, 2),
  validation_split = 0,
  start_from_model = NULL,
  metrics = c(),
  epochs = 10,
  batch_size = 32,
  seed = NULL
)

Value

An object of class SRDRN containing:

model: The trained Keras model.
input_mean: The mean value of the input data used for normalization.
input_sd: The standard deviation of the input data used for normalization.
target_mean: The mean value of the target data used for normalization.
target_sd: The standard deviation of the target data used for normalization.
input_mask: A logical array indicating the missing values in the input data.
target_mask: A logical array indicating the missing values in the target data.
min_time_point: The minimum time point in the input data.
max_time_point: The maximum time point in the input data.
cyclical_period: The cyclical period used for temporal encoding.
axis_names: A list containing the names of the axes (longitude, latitude, time).
history: The training history of the model.
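
The stored means, standard deviations, and masks suggest that the data are standardized before training and that predictions are mapped back to the original scale afterwards. The following is a minimal base-R sketch of that idea, not the package's implementation. It assumes a single scalar mean and standard deviation computed over non-missing cells, a mask with the same shape as the data, and missing cells filled with 0 after standardization. The helper names standardize_grid() and unstandardize_grid() are hypothetical.

```r
# Hypothetical helpers for illustration; not part of the package API.
standardize_grid <- function(x) {
  mask <- is.na(x)                          # TRUE where a value is missing
  mu <- mean(x, na.rm = TRUE)               # scalar mean over observed cells
  sigma <- sd(as.vector(x), na.rm = TRUE)   # scalar sd over observed cells
  z <- (x - mu) / sigma
  z[mask] <- 0                              # assumption: fill missing cells with 0
  list(data = z, mean = mu, sd = sigma, mask = mask)
}

unstandardize_grid <- function(z, stats) {
  x <- z * stats$sd + stats$mean            # back to the original scale
  x[stats$mask] <- NA                       # restore the missing cells
  x
}

set.seed(1)
coarse <- array(rnorm(4 * 4 * 3, mean = 10, sd = 2), dim = c(4, 4, 3))
coarse[1, 1, 1] <- NA
stats <- standardize_grid(coarse)
restored <- unstandardize_grid(stats$data, stats)
```

Here restored matches coarse up to floating-point error, including the position of the missing cell.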

Arguments

coarse_data: A 3D array of shape (N_1, N_1, n) containing the gridded coarse-resolution input data, where N_1 x N_1 is the coarse resolution and n is the sample size. The first two dimensions are the spatial coordinates and the third dimension indexes the samples (e.g. time).
fine_data: A 3D array of shape (N_2, N_2, n) containing the gridded fine-resolution target data, where N_2 x N_2 is the fine resolution and n is the sample size. The first two dimensions are the spatial coordinates and the third dimension indexes the samples (e.g. time).
time_points: An optional numeric vector of length n representing the time points associated with each sample.
val_coarse_data: An optional 3D array of shape (N_1, N_1, m) representing the input validation data, where m is the number of validation samples.
val_fine_data: An optional 3D array of shape (N_2, N_2, m) representing the target validation data.
val_time_points: An optional numeric vector of length m representing the time points of the validation samples.
cyclical_period: An optional numeric value representing the cyclical period for time encoding (e.g. 365 for yearly seasonality).
temporal_basis: A numeric vector specifying the temporal basis functions to use for time encoding (default is c(9, 17, 37)).
temporal_layers: A numeric vector specifying the number of units in each dense layer for time encoding (default is c(32, 64, 128)).
temporal_cnn_filters: A numeric vector specifying the number of filters in each convolutional layer for temporal feature processing (default is c(8, 16)).
temporal_cnn_kernel_sizes: A list of integer vectors specifying the kernel sizes for each convolutional layer in the temporal feature processing (default is list(c(3, 3), c(3, 3))).
activation: A character string specifying the activation function to use in the model to introduce nonlinearity. The options are listed in https://keras.io/api/layers/activations. Default is "relu".
cos_sin_time: A logical value indicating whether to use cosine and sine transformations for time encoding (default is FALSE).
use_batch_norm: A logical value indicating whether to use batch normalization in the residual blocks (default is FALSE).
output_channels: An integer specifying the number of output channels (default is 1).
num_residual_blocks: An integer specifying the number of residual blocks in the model (default is 3).
num_res_block_filters: An integer specifying the number of filters in each residual block (default is 64).
upscaling_filters: A numeric vector specifying the number of filters in each upsampling layer. By default, the first X values of c(64, 32, 16, 8, 4, 2) are used, where X is the upscaling factor.
validation_split: A numeric value between 0 and 1 specifying the fraction of the training data to use for validation (default is 0).
start_from_model: An optional pre-trained Keras model to continue training from (default is NULL).
metrics: A character vector specifying additional metrics to monitor during training (default is an empty vector).
epochs: An integer specifying the number of training epochs (default is 10).
batch_size: An integer specifying the batch size for training (default is 32).
seed: An optional integer value to set the random seed for reproducibility (default is NULL).
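
The shape requirements on coarse_data, fine_data, and time_points can be checked before training. The sketch below is a hypothetical pre-flight helper, check_grid_pair(), written in base R. It is not part of the package, and srdrn() may perform different checks internally. The requirement that N_2 be an integer multiple of N_1 is an assumption made here for illustration.

```r
# Hypothetical helper; the checks srdrn() performs itself may differ.
check_grid_pair <- function(coarse_data, fine_data, time_points = NULL) {
  dc <- dim(coarse_data)
  df <- dim(fine_data)
  stopifnot(length(dc) == 3, length(df) == 3)   # both must be 3D arrays
  stopifnot(dc[1] == dc[2], df[1] == df[2])     # square grids, as documented
  stopifnot(dc[3] == df[3])                     # same number of samples n
  if (!is.null(time_points)) {
    stopifnot(length(time_points) == dc[3])     # one time point per sample
  }
  ratio <- df[1] / dc[1]
  stopifnot(ratio == round(ratio))              # assumption: integer ratio
  ratio
}

coarse <- array(0, dim = c(16, 16, 20))
fine <- array(0, dim = c(32, 32, 20))
upscaling_factor <- check_grid_pair(coarse, fine, time_points = 1:20)
```

The returned ratio of fine to coarse grid size is the quantity referred to as the upscaling factor in the description of upscaling_filters.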

Details

The Super Resolution Deep Residual Network (SRDRN) implements a deep-learning-based spatial downscaling approach inspired by Super-Resolution CNNs (SRCNN; Dong et al., 2015) and extended for environmental applications following Wang et al. (2021).

The objective of SRDRN is to learn a mapping from coarse-resolution gridded fields to finer-resolution targets by combining convolutional feature extraction, residual learning, and sub-pixel upsampling. The method is designed for both purely spatial and fully spatio-temporal downscaling when time information is provided. The method consists of the following main components:

Feature Extraction Block: An initial convolutional layer extracts low-level spatial features from the coarse-resolution input.
Residual Blocks: A sequence of residual blocks learns higher-order spatial dependencies. Residual connections stabilize training and allow deeper representations.
Upsampling Module: Sub-pixel convolution (pixel shuffle) layers upscale feature maps to match the high-resolution target grid.
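
The rearrangement performed by a pixel shuffle step can be illustrated in base R. The sketch below takes an array of shape (H, W, C * r^2) and rearranges it into shape (H * r, W * r, C), so that each group of r^2 channels fills an r x r block of output pixels. The function pixel_shuffle() is hypothetical, and the channel ordering it uses is one common convention, assumed here for illustration. The ordering used by the Keras layers inside the package may differ.

```r
# Hypothetical illustration of the pixel shuffle (depth-to-space) rearrangement.
pixel_shuffle <- function(x, r) {
  h <- dim(x)[1]
  w <- dim(x)[2]
  c_out <- dim(x)[3] / (r * r)
  stopifnot(c_out == round(c_out))          # channels must be divisible by r^2
  out <- array(0, dim = c(h * r, w * r, c_out))
  for (ch in seq_len(c_out)) {
    for (i in seq_len(r)) {
      for (j in seq_len(r)) {
        # Input channel that supplies offset (i, j) within each r x r block
        k <- (ch - 1) * r * r + (i - 1) * r + j
        rows <- seq(i, h * r, by = r)
        cols <- seq(j, w * r, by = r)
        out[rows, cols, ch] <- x[, , k]
      }
    }
  }
  out
}

x <- array(1:16, dim = c(2, 2, 4))   # 2 x 2 grid with 4 channels
y <- pixel_shuffle(x, r = 2)         # 4 x 4 grid with 1 channel
```

No values are created or discarded: the 16 input values are only moved from the channel dimension into the spatial dimensions, which is why the preceding convolution must produce r^2 times as many channels as the upscaled output needs.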

If time_points is provided, the model includes an auxiliary temporal branch. Time is encoded via either:

Radial basis temporal encodings (temporal_basis), or
Cosine–sine cyclical encodings (cos_sin_time = TRUE).

The encoded temporal features pass through a multilayer perceptron (temporal_layers) and are reshaped to spatial form before being concatenated with CNN features. This enables learning time-varying downscaling dynamics (e.g., seasonality, long-term trends). The function supports missing data via masking.
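
The two time encodings can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's code. The cosine-sine encoding maps each time point to a point on the unit circle with period cyclical_period. For the radial basis encoding, the sketch assumes that each entry of temporal_basis is the number of evenly spaced centres at one resolution, that time is first rescaled to [0, 1], and that the basis functions are Gaussian. The kernel actually used by the package may differ. Both helper names are hypothetical.

```r
# Hypothetical helpers; names, kernel, and bandwidth are assumptions.
encode_cos_sin <- function(time_points, cyclical_period) {
  angle <- 2 * pi * time_points / cyclical_period
  cbind(cos = cos(angle), sin = sin(angle))
}

encode_radial <- function(time_points, n_centres = c(9, 17, 37),
                          min_time = min(time_points),
                          max_time = max(time_points)) {
  s <- (time_points - min_time) / (max_time - min_time)  # rescale to [0, 1]
  blocks <- lapply(n_centres, function(k) {
    centres <- seq(0, 1, length.out = k)
    width <- 1 / (k - 1)                                 # spacing between centres
    exp(-0.5 * (outer(s, centres, "-") / width)^2)       # Gaussian bumps
  })
  do.call(cbind, blocks)                                 # one column per centre
}

days <- 1:730
cs <- encode_cos_sin(days, cyclical_period = 365)  # 730 x 2 matrix
rb <- encode_radial(days)                          # 730 x (9 + 17 + 37) matrix
```

With a period of 365, days 1 and 366 receive the same cosine-sine encoding, which is what lets the model treat them as the same point in the seasonal cycle. The radial features vary smoothly over the whole record instead, so they can also represent long-term trends.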

References

Examples


# \donttest{
 # Generate dummy low-resolution (16×16) and high-resolution (32×32) data
 n <- 20
 input  <- array(runif(16 * 16 * n),  dim = c(16, 16, n))
 target <- array(runif(32 * 32 * n),  dim = c(32, 32, n))
 
 model1 <- srdrn(
   coarse_data  = input,
   fine_data = target,
   epochs = 1,
   batch_size = 4
 )
# }
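
To use an explicit validation set rather than validation_split, the samples can be split along the third dimension before calling srdrn(). The following base-R sketch shows one way to do this. The hold-out size of 4 and the object names are arbitrary choices for illustration, and the srdrn() call is left commented out because it requires the package and a working Keras installation.

```r
set.seed(42)
n <- 20
input  <- array(runif(16 * 16 * n), dim = c(16, 16, n))
target <- array(runif(32 * 32 * n), dim = c(32, 32, n))
time_points <- seq_len(n)

# Hold out 4 of the 20 samples for validation
val_idx <- sort(sample(n, size = 4))
train_idx <- setdiff(seq_len(n), val_idx)

# drop = FALSE keeps the arrays 3D even if only one sample is selected
train_coarse <- input[, , train_idx, drop = FALSE]
train_fine   <- target[, , train_idx, drop = FALSE]
val_coarse   <- input[, , val_idx, drop = FALSE]
val_fine     <- target[, , val_idx, drop = FALSE]

# The pieces map onto the documented arguments, for example:
# model2 <- srdrn(
#   coarse_data = train_coarse, fine_data = train_fine,
#   time_points = time_points[train_idx],
#   val_coarse_data = val_coarse, val_fine_data = val_fine,
#   val_time_points = time_points[val_idx],
#   epochs = 1, batch_size = 4
# )
```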
