
CSDownscale (version 0.0.1)

Analogs: Downscaling using Analogs based on large scale fields.

Description

This function performs downscaling using analogs. To compute the analogs for a given coarse-scale field, the function looks for days with similar conditions in the historical observations. It determines the N best analogs based on Euclidean distance, distance correlation, or Spearman's correlation.

To downscale a local-scale variable, either the variable itself or another large-scale variable can be used as the predictor. In the first case, analogs are searched between the observation and model data of the same local-scale variable. In the second case, the function identifies the day in the observation data that most closely resembles the large-scale pattern of interest in the model. Once it has identified the date of the best analog, the function extracts the local-scale variable for that day from the observations of the local-scale variable. The local-scale and large-scale variables can be taken from independent regions. The input data for the first case must include 'exp' and 'obs', while the second case requires 'obs', 'obsL' and 'exp'. Users can restrict the downscaling to a subregion, defined through the 'region' argument, instead of using the entire area of the loaded data.

The search for analogs should be done in the longest dataset possible, although this may require substantial memory. A long dataset matters because it gives a better representation of the possible past states of the field and therefore better analogs. The function can also look for analogs within a window of D days, but it is the user who has to define that window; otherwise, the function looks for analogs in the whole dataset.

This function is intended to downscale climate prediction data (i.e., sub-seasonal, seasonal and decadal predictions) but also accepts climate projections or reanalyses. It has no constraints on the region or variables to downscale.
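
As an illustration of the search described above, the following base R sketch ranks historical fields by Euclidean distance to a target field and keeps the N closest ones. It is not the package's implementation: the object names (obs_fields, target, distances, best) and the synthetic data are assumptions made for the example, and each field is flattened to a plain vector for simplicity.

```r
# Illustrative sketch only; not the CSDownscale implementation.
set.seed(1)

n_days  <- 50   # hypothetical number of historical days
n_cells <- 20   # hypothetical number of grid cells per field

# Each row is one historical observed field, flattened to a vector.
obs_fields <- matrix(rnorm(n_days * n_cells), nrow = n_days, ncol = n_cells)

# The coarse-scale field for which analogs are sought.
target <- rnorm(n_cells)

# Euclidean distance between the target and every historical field:
# subtract the target from each row, square, sum over cells, take the root.
distances <- sqrt(rowSums(sweep(obs_fields, 2, target)^2))

# Row indices of the N closest days, smallest distance first.
nanalogs <- 3
best <- order(distances)[seq_len(nanalogs)]
best
```

The rows selected in 'best' play the role of the analog days; in the second scenario described above, the local-scale variable would then be read from the observations on those same days.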

Usage

Analogs(
  exp,
  obs,
  exp_lats = NULL,
  exp_lons = NULL,
  obs_lats,
  obs_lons,
  grid_exp,
  obsL = NULL,
  obsL_lats = NULL,
  obsL_lons = NULL,
  nanalogs = 3,
  fun_analog = NULL,
  lat_dim = "lat",
  lon_dim = "lon",
  sdate_dim = "sdate",
  time_dim = "time",
  member_dim = "member",
  metric = "dist",
  region = NULL,
  return_indices = FALSE,
  loocv_window = TRUE,
  ncores = NULL
)

Value

A list of three elements. 'data' contains the downscaled field, 'lat' the downscaled latitudes, and 'lon' the downscaled longitudes. If fun_analog is set to NULL (default), the output array in 'data' also contains the dimension 'analog' with the best analog days.

Arguments

exp

an array with named dimensions containing the experimental field on the coarse scale for the variable targeted for downscaling (if obsL is not provided) or for the large-scale variable used as the predictor (if obsL is provided). The object must have, at least, the dimensions latitude, longitude, start date and time. The object is expected to be already subset for the desired region. Data can be in one or two integrated regions, e.g., crossing the Greenwich meridian. To get correct results in the latter case, the borders of the region should be specified in the parameter 'region'. The object can contain either hindcast or forecast data. However, if forecast data is provided, the loocv_window parameter should be set to FALSE.

obs

an array with named dimensions containing the observational field for the variable targeted for downscaling. The object must have, at least, the dimensions latitude, longitude, start date and either time or window. The object is expected to be already subset for the desired region. Optionally, 'obs' can have the dimension 'window', containing the sampled fields into which the function will look for the analogs. Otherwise, the function will look for analogs using all the possible fields contained in obs.

exp_lats

a numeric vector containing the latitude values in 'exp'. Latitudes must range from -90 to 90.

exp_lons

a numeric vector containing the longitude values in 'exp'. Longitudes can range from -180 to 180 or from 0 to 360.

obs_lats

a numeric vector containing the latitude values in 'obs'. Latitudes must range from -90 to 90.

obs_lons

a numeric vector containing the longitude values in 'obs'. Longitudes can range from -180 to 180 or from 0 to 360.

grid_exp

a character vector describing the grid of the exp data. It can be either a path to a NetCDF file from which to read the target grid (a single grid must be defined in that file) or the name of the coarse grid to be passed to CDO, which must be a grid recognised by CDO.

obsL

an 's2dv_cube' object with named dimensions containing the observational field of the large-scale variable. The object must have, at least, the dimensions latitude, longitude, start date and either time or window. The object is expected to be already subset for the desired region. Optionally, 'obsL' can have the dimension 'window', containing the sampled fields into which the function will look for the analogs. Otherwise, the function will look for analogs using all the possible fields contained in obs.

obsL_lats

a numeric vector containing the latitude values in 'obsL'. Latitudes must range from -90 to 90.

obsL_lons

a numeric vector containing the longitude values in 'obsL'. Longitudes can range from -180 to 180 or from 0 to 360.

nanalogs

an integer indicating the number of analogs to be searched.

fun_analog

a function to be applied over the found analogs. Only these options are valid: "mean", "wmean", "max", "min", "median" or NULL. If set to NULL (default), the function returns the found analogs.
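
To make the effect of 'fun_analog' concrete, the sketch below collapses three analog values into a single value in a few of the ways named above. The numbers and object names are hypothetical. The weighting used for "wmean" is not specified in this documentation, so the inverse-distance weights shown here are purely an assumption for illustration.

```r
# Illustrative sketch only; values and names are hypothetical.

# Local-scale variable on the three best analog days.
analog_values <- c(12.1, 13.4, 11.8)

# Distances of those analogs to the target field (smaller = closer).
analog_dist <- c(0.8, 1.1, 1.5)

# "mean" and "median": plain summaries over the 'analog' dimension.
collapsed_mean   <- mean(analog_values)
collapsed_median <- median(analog_values)

# "wmean": ASSUMPTION, weights taken as inverse distance so that
# closer analogs count more. The package may weight differently.
w <- 1 / analog_dist
collapsed_wmean <- weighted.mean(analog_values, w)

c(mean = collapsed_mean, median = collapsed_median, wmean = collapsed_wmean)
```

With fun_analog = NULL no such collapsing takes place and all analogs are kept along the 'analog' dimension of the output.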

lat_dim

a character vector indicating the latitude dimension name in the element 'data' in exp and obs. Default set to "lat".

lon_dim

a character vector indicating the longitude dimension name in the element 'data' in exp and obs. Default set to "lon".

sdate_dim

a character vector indicating the start date dimension name in the element 'data' in exp and obs. Default set to "sdate".

time_dim

a character vector indicating the time dimension name in the element 'data' in exp and obs. Default set to "time".

member_dim

a character vector indicating the member dimension name in the element 'data' in exp and obs. Default set to "member".

metric

a character vector to select the analog specification method. Only these options are valid: "dist" (i.e., Euclidean distance), "dcor" (i.e., distance correlation) or "cor" (i.e., Spearman's correlation). The default metric is "dist".
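
One practical difference between the metrics is the direction of the ranking: a distance is best when smallest, a correlation when largest. The sketch below contrasts "dist" and "cor" in base R on synthetic data. It is an illustration, not the code used internally by Analogs, and all object names are hypothetical. "dcor" is left out because base R has no distance correlation function.

```r
# Illustrative sketch only; not the CSDownscale implementation.
set.seed(42)

n_days  <- 40   # hypothetical number of candidate days
n_cells <- 25   # hypothetical number of grid cells
obs_fields <- matrix(rnorm(n_days * n_cells), nrow = n_days)
target <- rnorm(n_cells)

# Plant a known pattern match: day 7 is a monotonic transform of the
# target, so its Spearman correlation with the target is exactly 1,
# although the offset of 5 makes it far away in absolute terms.
obs_fields[7, ] <- 2 * target + 5

# "dist": Euclidean distance, smaller is better.
d <- apply(obs_fields, 1, function(f) sqrt(sum((f - target)^2)))
best_dist <- order(d)[1:3]

# "cor": Spearman's rank correlation, larger is better.
r <- apply(obs_fields, 1, function(f) cor(f, target, method = "spearman"))
best_cor <- order(r, decreasing = TRUE)[1:3]

list(best_dist = best_dist, best_cor = best_cor)
```

Day 7 ranks first under the correlation metric because its spatial pattern matches the target, while the distance metric penalises its offset. This is why the choice of metric can change which days are selected.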

region

a numeric vector indicating the borders of the downscaling region. It consists of four elements in this order: lonmin, lonmax, latmin, latmax. lonmin refers to the left border, while lonmax refers to the right border. latmin indicates the lower border, whereas latmax indicates the upper border. If set to NULL (default), the function uses the full obs grid as the downscaling region.
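
The element order of 'region' can be illustrated by subsetting an observation grid by hand. The sketch below is only a base R illustration with hypothetical data; it handles a simple box and does not cover regions that cross the Greenwich meridian, nor does it claim to reproduce how Analogs applies the argument internally.

```r
# Illustrative sketch only; data and names are hypothetical.
set.seed(3)

obs_lons <- seq(0, 6, length.out = 15)
obs_lats <- seq(0, 6, length.out = 12)
obs <- array(rnorm(12 * 15), dim = c(lat = 12, lon = 15))

# Order of the four elements: lonmin, lonmax, latmin, latmax.
region <- c(1, 4, 2, 5)

# Logical masks for the grid points that fall inside the box.
lon_keep <- obs_lons >= region[1] & obs_lons <= region[2]
lat_keep <- obs_lats >= region[3] & obs_lats <= region[4]

# Subset the field and the coordinate vectors consistently.
obs_sub  <- obs[lat_keep, lon_keep, drop = FALSE]
sub_lons <- obs_lons[lon_keep]
sub_lats <- obs_lats[lat_keep]

# Base subsetting may drop dimension names, so they are set again here.
names(dim(obs_sub)) <- c("lat", "lon")
dim(obs_sub)
```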

return_indices

a logical vector indicating whether to return the indices of the analogs together with the downscaled fields. The indices refer to the position of the element in the vector time * start_date. If 'obs' contains the dimension 'window', the indices refer to the position of the element in the dimension 'window'. Defaults to FALSE.
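
To map a returned index back to a time step and a start date, something like the following could be used. This sketch assumes that the combined vector is ordered with 'time' varying fastest within each start date. The documentation above does not state the ordering, so check it against your own output before relying on it. The index value and object names are hypothetical.

```r
# Illustrative sketch only.
# ASSUMPTION: in the combined time * start_date vector, 'time' varies
# fastest. Verify this against real output of Analogs().

n_times  <- 30   # length of the 'time' dimension in obs
n_sdates <- 5    # length of the 'sdate' dimension in obs

analog_index <- 37   # hypothetical index returned for one analog

# Convert the linear index into a (time, sdate) pair.
pos <- arrayInd(analog_index, .dim = c(n_times, n_sdates))
colnames(pos) <- c("time", "sdate")
pos
```

Under that assumption, index 37 corresponds to the 7th time step of the 2nd start date.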

loocv_window

a logical vector, used only if 'obs' does not have the dimension 'window'. It indicates whether to apply leave-one-out cross-validation when creating the window. Setting it to TRUE is recommended. Defaults to TRUE.
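
The idea behind leave-one-out cross-validation can be sketched as follows: when searching analogs for a given start date, the observed fields belonging to that same start date are removed from the candidate pool, so a hindcast day cannot be matched with its own observation. The code is a schematic base R illustration with hypothetical names. How Analogs builds its window internally is not described here, so this is an assumption about the general technique rather than a description of the package.

```r
# Illustrative sketch only; not the CSDownscale implementation.

n_times  <- 30   # length of the 'time' dimension
n_sdates <- 5    # length of the 'sdate' dimension

# One row per candidate observed field, labelled by time and start date.
candidates <- expand.grid(time = seq_len(n_times), sdate = seq_len(n_sdates))

# Start date currently being downscaled (hypothetical).
target_sdate <- 2

# Leave-one-out: drop every candidate from the target start date.
pool <- candidates[candidates$sdate != target_sdate, ]

nrow(candidates)  # all candidate fields
nrow(pool)        # candidates left after excluding the target start date
```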

ncores

an integer indicating the number of cores to use in parallel computation. The default value is NULL.

Author

J. Ramon, jaumeramong@gmail.com

E. Duzenli, eren.duzenli@bsc.es

Ll. Lledó, llorenc.lledo@ecmwf.int

Examples

library(CSDownscale)

# Synthetic coarse-scale experiment: 5 members on a 4 x 5 grid
exp <- rnorm(15000)
dim(exp) <- c(member = 5, lat = 4, lon = 5, sdate = 5, time = 30)
exp_lons <- 1:5
exp_lats <- 1:4

# Synthetic observations on a finer 12 x 15 grid
obs <- rnorm(27000)
dim(obs) <- c(lat = 12, lon = 15, sdate = 5, time = 30)
obs_lons <- seq(0, 6, 6/14)
obs_lats <- seq(0, 6, 6/11)

# grid_exp is passed to CDO, so run only if the 'cdo' binary is available
if (Sys.which("cdo") != "") {
  downscaled_field <- Analogs(exp = exp, obs = obs, exp_lats = exp_lats,
                              exp_lons = exp_lons, obs_lats = obs_lats,
                              obs_lons = obs_lons, grid_exp = 'r360x180')
}
