Cressman: Cressman Objective Analysis Method

Description

The Cressman objective analysis computes values at grid points $Z_{ij}^a$ (where $i$ and $j$ are the grid point indices for a 2D grid) as the weighted average of the difference between observed values $Z_k^o$ and background values interpolated to the observation locations $Z_k^b$ (i.e., $Z_k^o - Z_k^b$, called the observation increment) plus the background value at the grid point $Z_{ij}^b$.

Usage

Cressman(
  BD_Obs,
  BD_Coord,
  shapefile,
  grid_resolution,
  search_radius,
  training = 1,
  stat_validation = NULL,
  Rain_threshold = NULL,
  save_model = FALSE
)

Value

The return value depends on whether validation has been performed.

Without validation: The function returns a list, where each element is a SpatRaster object containing the interpolated values for a specific search radius defined in search_radius. The number of elements in this list matches the length of search_radius.
With validation: The function returns a named list with two elements:
- Ensamble: A list where each element corresponds to a SpatRaster object containing the interpolated values for a specific search radius in search_radius.
- Validation: A list where each element is a data.table containing the validation results for the corresponding interpolated SpatRaster. Each data.table includes goodness-of-fit metrics such as RMSE, MAE, and Kling-Gupta Efficiency (KGE), along with categorical metrics if Rain_threshold is provided.

The number of elements in both the Ensamble and Validation lists matches the length of search_radius, ensuring that each interpolation result has an associated validation dataset.

Arguments

BD_Obs

A data.table or data.frame containing observational data with the following structure:

The first column (Date): A Date object representing the observation date.
The remaining columns: Each column corresponds to a unique ground station, where the column name is the station identifier.

The dataset should be structured as follows:

> BD_Obs
# A data.table or data.frame with n rows (dates) and m+1 columns (stations + Date)
   Date        ST001  ST002  ST003  ST004  ...
   <date>      <dbl>  <dbl>  <dbl>  <dbl>  ...
1  2015-01-01    0      0      0      0    ...
2  2015-01-02    0      0      0     0.2   ...
3  2015-01-03   0.1     0      0     0.1   ...

Each station column contains numeric values representing observed measurements.
The column names (station identifiers) must be unique and match those in BD_Coord$Cod to ensure proper spatial referencing.
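The structure described above can be checked before calling Cressman. The following is a minimal base R sketch: the station codes, values, and UTM coordinates are made up for illustration, and the checks are an assumption about what is worth verifying, not part of the package.

```r
# Toy observations: first column is Date, remaining columns are stations
# (all values are illustrative)
BD_Obs <- data.frame(
  Date  = as.Date(c("2015-01-01", "2015-01-02", "2015-01-03")),
  ST001 = c(0, 0, 0.1),
  ST002 = c(0, 0, 0),
  ST003 = c(0, 0.2, 0.1)
)

# Toy station metadata (hypothetical UTM coordinates in metres)
BD_Coord <- data.frame(
  Cod = c("ST001", "ST002", "ST003"),
  X   = c(720000, 725000, 731000),
  Y   = c(9680000, 9685000, 9690000)
)

station_cols <- setdiff(names(BD_Obs), "Date")

# Station identifiers must be unique and present in BD_Coord$Cod
ids_unique  <- !anyDuplicated(station_cols)
ids_matched <- all(station_cols %in% BD_Coord$Cod)

# The Date column must be of class Date
date_ok <- inherits(BD_Obs$Date, "Date")
```

If any of the three flags is FALSE, the inputs do not follow the layout described above and should be fixed before interpolation.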

BD_Coord

A data.table or data.frame containing the metadata of the ground stations. It must include the following columns:

"Cod": Unique identifier for each ground station.
"X": X coordinate (easting) of the station in UTM format.
"Y": Y coordinate (northing) of the station in UTM format.

shapefile

A shapefile defining the study area, used to constrain the interpolation to the region of interest. The shapefile must be of class SpatVector (from the terra package) and should have a UTM coordinate reference system.

grid_resolution

A numeric value indicating the resolution of the interpolation grid in kilometers (km).

search_radius

A numeric vector indicating the search radius in kilometers (km) for the Cressman method. One interpolation is produced for each value in the vector. Note: See the "Notes" section for additional details on how to choose search radius values.

training

Numerical value between 0 and 1 indicating the proportion of data used for model training. The remaining data are used for validation. For example, training = 0.8 means that 80% of the data are used for training and 20% for validation. If you do not want to perform validation, set training = 1. (Default training = 1).
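To make the meaning of the proportion concrete, here is a small sketch of a random split of stations into training and validation sets. The station codes are hypothetical, and the rounding rule and sampling approach are assumptions; the package may split the data differently.

```r
# Hypothetical station identifiers
stations <- c("ST001", "ST002", "ST003", "ST004", "ST005",
              "ST006", "ST007", "ST008", "ST009", "ST010")
training <- 0.8

set.seed(123)  # make the random split reproducible

# Number of stations kept for training (rounding is an assumption)
n_train <- round(length(stations) * training)

# Randomly pick training stations; the rest are held out for validation
train_stations      <- sample(stations, n_train)
validation_stations <- setdiff(stations, train_stations)
```

With ten stations and training = 0.8, eight stations are used for training and two are held out for validation.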

stat_validation

A character vector specifying the names of the stations to be used for validation. Fill in this option only when you want to choose the validation stations manually. If this parameter is NULL and training is different from 1, validation is performed using randomly selected stations. The vector must contain the names of the stations selected by the user for validation. For example, stat_validation = c("ST001", "ST002"). (Default stat_validation = NULL).

Rain_threshold

List of numerical vectors defining precipitation thresholds to classify precipitation into different categories according to its intensity. This parameter should be entered only when the validation is to include categorical metrics such as Critical Success Index (CSI), Probability of Detection (POD), False Alarm Rate (FAR), etc. Each list item should represent a category, with the category name as the list item name and a numeric vector specifying the lower and upper bounds of that category. Note: See the "Notes" section for additional details on how to define categories, use this parameter for validation, and example configurations.
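The named-list layout described above can be sketched as follows. The category names and bounds are illustrative assumptions, and classify_rain is a hypothetical helper written only to show how such intervals partition values; it is not a function of the package, and whether Cressman treats the bounds as inclusive or exclusive is not confirmed here.

```r
# Illustrative categories: name = c(lower, upper)
Rain_threshold <- list(
  no_rain    = c(0, 1),
  light_rain = c(1, 5),
  moderate   = c(5, 20),
  heavy      = c(20, Inf)
)

# Hypothetical helper: label each value with the first category whose
# interval [lower, upper) contains it; NA when no category matches
classify_rain <- function(x, thresholds) {
  vapply(x, function(v) {
    if (is.na(v)) return(NA_character_)
    hit <- vapply(thresholds, function(b) v >= b[1] && v < b[2], logical(1))
    if (any(hit)) names(thresholds)[which(hit)[1]] else NA_character_
  }, character(1))
}

rain_classes <- classify_rain(c(0, 0.2, 3, 12, 40), Rain_threshold)
```

Each list element carries the category name and a length-two numeric vector with its lower and upper bounds, which is the shape the argument description asks for.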

save_model

Logical value indicating whether the interpolation file should be saved to disk. The default value is FALSE, meaning that the interpolated file is not saved.

Author

Jonnathan Landi jonnathan.landi@outlook.com

Details

The Cressman method is defined by the following equation: $$Z_{ij}^a = Z_{ij}^b + \frac{\sum_{k=1}^{n} w_k (Z_k^o - Z_k^b)}{\sum_{k=1}^{n} w_k}$$ where:

$Z_{ij}^a$: is the analysis value at grid point $i,j$.
$Z_{ij}^b$: is the background value at grid point $i,j$.
$Z_k^o$: is the observed value at station $k$.
$Z_k^b$: is the background value interpolated to station $k$.
$w_k$: is the weight assigned to station $k$.
$n$: is the total number of stations used.

The weight $w_k$ depends on the distance $r = \sqrt{(x_{ij} - x_k)^2 + (y_{ij} - y_k)^2}$ between observation $k$ and grid point $(i, j)$, and on the influence radius $R$. Beyond the influence radius, the weight is set to zero; $R$ is therefore often referred to as the cut-off radius.
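As a worked illustration of the equation, the sketch below uses the classical Cressman (1959) weight, $w_k = (R^2 - r^2)/(R^2 + r^2)$ for $r \le R$ and $w_k = 0$ otherwise. The function names, the toy data, and the choice to fall back to the background value when no station lies within $R$ are assumptions made for this example; the package's internal implementation may differ.

```r
# Classical Cressman weight: positive inside the radius R, zero outside
cressman_weight <- function(r, R) {
  ifelse(r <= R, (R^2 - r^2) / (R^2 + r^2), 0)
}

# Analysis value at a single grid point (hypothetical helper)
cressman_point <- function(x_g, y_g, z_b_grid, x_k, y_k, z_o, z_b_obs, R) {
  r <- sqrt((x_g - x_k)^2 + (y_g - y_k)^2)   # distance to each station
  w <- cressman_weight(r, R)
  if (sum(w) == 0) return(z_b_grid)           # no station in range: keep background
  z_b_grid + sum(w * (z_o - z_b_obs)) / sum(w)
}

# Toy example: three stations, coordinates and radius in km
x_k     <- c(0, 3, 30)
y_k     <- c(0, 4, 40)
z_o     <- c(2, 4, 10)   # observed values
z_b_obs <- c(1, 1, 1)    # background interpolated to the stations

z_a <- cressman_point(0, 0, z_b_grid = 1, x_k, y_k, z_o, z_b_obs, R = 10)
```

Here the distances are 0, 5, and 50 km, so the weights are 1, 0.6, and 0. The weighted mean increment is (1 × 1 + 0.6 × 3) / 1.6 = 1.75, giving an analysis value of 2.75. The third station lies beyond the cut-off radius and does not contribute.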

References

Cressman, G. P., 1959: An operational objective analysis system. Mon. Wea. Rev., 87, 367-374, doi:10.1175/1520-0493(1959)087<0367:AOOAS>2.0.CO;2.

Examples

library(InterpolateR)

# Load data from on-site observations
data("BD_Obs", package = "InterpolateR")
data("BD_Coord", package = "InterpolateR")

# Load the study area where the interpolation is performed
shapefile <- terra::vect(system.file("extdata/study_area.shp", package = "InterpolateR"))

# Perform the interpolation
Interpolated_Cressman <- Cressman(BD_Obs, BD_Coord, shapefile, grid_resolution = 5,
                                  search_radius = c(20, 10), training = 1,
                                  stat_validation = "M001", Rain_threshold = NULL,
                                  save_model = FALSE)

# Results
Radius_20 <- Interpolated_Cressman$Ensamble[[1]] # Interpolated data with a 20 km radius
Radius_10 <- Interpolated_Cressman$Ensamble[[2]] # Interpolated data with a 10 km radius

# Validation statistics
# Validation results with a 20 km radius
Validation_results_20 <- Interpolated_Cressman$Validation[[1]]
# Validation results with a 10 km radius
Validation_results_10 <- Interpolated_Cressman$Validation[[2]]
