knobi_env: KBPM environmental analysis

Description

Analyze and model the relationships between surplus production (SP) and environmental covariable(s) to test whether productivity changes in response to environmental fluctuations. The analysis is conducted in three steps:

(1) Correlation Analysis: Assess the correlation between the standardized environmental variable(s), at different delays (lags), and the KBPM residuals using Pearson's correlation or autoregressive models. See details.

(2) Variable Selection: Determine which lagged environmental variable(s) will be included in the environmental KBPM models.

(3) Environmental Fitting: Fit the KBPM model incorporating environmental effects as both additive and multiplicative effects (see details).

Value

A list containing the results of the three-step environmental analysis:

add: estimates of the additive model parameters.
mult: estimates of the multiplicative model parameters.
BRPs: reference points (BRPs) estimates for each model (see details).
df: data frame with the information used in the fit.
selected_var: environmental variable(s) used in the fit.
selected_lag: data frame providing the time lag of the environmental variable(s) in the KBPM fit.
lag_cor: correlation between the environmental variable(s) and the KBPM residuals for each time lag.
env_aic: if ar_cor = TRUE, AIC values of each autoregressive model (see details).
scaled_var: standardized environmental variable(s) used in the fit.
plots3D: list with 3D plot objects (if plot3d = TRUE).
residuals: Pearson residuals from each model fit (base, additive, and multiplicative).
performance_metrics: array of performance and accuracy measures for each model, including those from the error_table output of knobi_fit, plus an F-test comparing environmental models to the base model.

The function also returns plots in the graphics window and saves them (if plot_out = TRUE) to the specified directory. The first plot shows the correlation analysis between the environmental variable(s) and the base residuals. The second shows fitted SP values from all models. If multicovar = FALSE and plot3d = TRUE, 3D plots are also returned. If multicovar = TRUE, a plot with Pearson correlations among environmental variables is included.

Arguments

knobi_results

The output object of knobi_fit function (main package function).

data

A list containing the following data:

env: containing the values of each environmental variable, with each column representing a different variable and each row representing a year.
years: years in which the environmental variable(s) are reported.

control

Optional. List containing the following settings:

nlag: this argument specifies the maximum lag of the environmental variable to test in the correlation analysis, meaning that lags less than or equal to nlag (a natural number) are evaluated. The correlation between $residuals_t$ and $X_{t - lag}$ is computed, where $X$ is the environmental variable and $lag$ takes values from the sequence $\{0, 1, \ldots, nlag\}$. The lagged environmental variable corresponding to the highest correlation with the KBPM residuals is included in the environmental model. By default, nlag = 3. See details.
lag: an optional numerical vector specifying the lag value(s) to consider in the relationship between the KBPM surplus production and the environmental variable(s). The length of this vector must match the number of environmental variables included. This argument applies only if the nlag argument is not provided.
start_c: optional. A numerical vector specifying the starting values for the environmental parameter $c$ in the additive and multiplicative models, respectively. By default, start_c = c(1, 1). See details.
ar_cor: optional. Logical. By default, this argument is FALSE, meaning the correlation between the KBPM residuals and the environmental variable(s) is analyzed using the Pearson correlation measure. If set to TRUE, the relationship is instead analyzed by fitting autoregressive (AR) models, with each lagged environmental variable included as an explanatory covariate for KBPM residuals. The environmental variable associated with the model that has the lowest Akaike Information Criterion (AIC) is selected for inclusion in the environmental KBPM fit. See details.
plot3d: optional. Logical. If set to TRUE, 3D plots are generated, displaying the surplus production curve across a range of values for the environmental variable(s). The default is FALSE.
multicovar: optional. Logical. If TRUE, the environmental model incorporates all input environmental covariates simultaneously. By default, this argument is FALSE, meaning that only the environmental variable with the highest correlation (after lagging, if applicable) is included in the model.

plot_out

Logical. If set to TRUE, a file containing the plot of the environmental fits is created. The default value is taken from the input in the knobi_fit function.

plot_dir

Optional. Directory where the folder for saving the plots will be created. Required when plot_out = TRUE. The default value is taken from the input in the knobi_fit function.

plot_filename

Optional. Name of the folder that will contain the plots. Required when plot_out = TRUE. The default value is taken from the input in the knobi_fit function.

Author

Anxo Paz
Marta Cousido-Rocha
Santiago Cerviño López
M. Grazia Pennino

Details

It is important to mention that the environmental variable(s), in a first step, are standardized, in order to make their scale and magnitude comparable. To do this, each variable is subtracted from its mean and divided by its standard deviation.

The additive environmental model adds the following term on the right-hand side of Eq. (1) or Eq. (2) described in knobi_fit: $c X_{t - lag} B_t$, being $X_{t - lag}$ the environmental variable at time $t - lag$ and $B_t$ the biomass or SSB at time $t$.

The multiplicative environmental model multiplies the right-hand side of Eq. (1) or Eq. (2) by $\exp(c X_{t - lag})$.

In the case of these models, the estimated biological reference points (BRPs) correspond to a value of the scaled environmental variable equal to the mean of the time series, i.e., $X_t = 0$, which cancels out the effect of the parameter $c$. The estimates of the remaining parameters included in Eq. (1) or Eq. (2), and therefore for the BRPs as well, will differ from the base model due to the inclusion of environmental effects. For more details, such as the calculation of BRPs as a function of the environmental variable, see the vignettes.

If ar_cor = TRUE, the correlation analysis between the knobi_fit residuals and the environmental variable(s) is conducted as follows:

First, an AR model is fitted to the KBPM base residuals: $$r_t = \sum_{i = 1}^{\rho} \beta_i r_{t - i} + \epsilon_t$$ where $r_t$ is the KBPM base residual for year $t$, and $\rho$ is the AR model order, estimated as the maximum lag at which the absolute value of the residuals' partial autocorrelation exceeds $qnorm(0.975) / \sqrt{N_r}$, with $N_r$ being the length of the residuals series.

AR models are then fitted to the residuals incorporating each lagged environmental variable $X_{t - lag}$ as an explanatory covariate: $$r_t = \sum_{i = 1}^{\rho} \beta_i r_{t - i} + X_{t - lag} + \epsilon_t$$ for $lag = 0, 1, \ldots, nlag$. Then, we have one autoregressive model for each lag.

The lagged environmental variable whose AR model has the lowest AIC is selected for inclusion in the environmental KBPM fit. If none improve the AIC compared to the base AR model, no environmental covariate is added.

It is important to highlight that the results include the analysis of the AR model only with the base model residuals in order to determine the need for coupling environmental information, considering that it would not be necessary if this model shows a lower AIC, even reducing the number of parameters to fit.

Examples

Run this code


# \donttest{

# First, run the example of knobi_fit function

# Then, provide environmental data series

Env <- knobi_dataset$Env

# The environmental data series must start in the first year of the KBPM fit data
# minus the provided nlag or lag arguments
years <- knobi_results$df$Year # See knobi_fit example to obtain the knobi_results object
ind <- which(Env[,1]==years[1])
ind1 <- which(Env[,1]==years[length(years)])
nlag <- 5
Env <- Env[(ind-nlag):ind1,]

# Now we create the environmental list
data <- list(env=data.frame(AMO=Env$AMO,NAO=Env$NAO),
           years=Env$years)
control <- list(nlag=nlag)

knobi_environmental <- knobi_env(knobi_results,data,control)
knobi_environmental
knobi_environmental$BRPs  # use the '$' to access to all the fit information
# }

Run the code above in your browser using DataLab