Functions for evaluating model performance using comprehensive metrics. The package provides both generic S3 methods and direct function calls for model evaluation.
Usage:

evaluate(object, ...)

# S3 method for class 'SCA'
evaluate(object, Testing_data, Predictant, digits = 3, ...)

# S3 method for class 'SCE'
evaluate(object, Testing_data, Training_data, Predictant, digits = 3, ...)

SCA_Model_evaluation(Testing_data, Simulations, Predictant, digits = 3)

SCE_Model_evaluation(Testing_data, Training_data, Simulations, Predictant, digits = 3)
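For orientation, a direct call might look like the following. This is a minimal sketch with synthetic stand-in data (obs and sim are hypothetical); a real Simulations list comes from the package's simulation step and follows the SCA convention described under Arguments:

# Minimal sketch with synthetic data (illustrative only):
set.seed(1)
obs <- data.frame(swvl3 = runif(50), swvl4 = runif(50))
sim <- data.frame(swvl3 = obs$swvl3 + rnorm(50, sd = 0.05),   # stand-in predictions
                  swvl4 = obs$swvl4 + rnorm(50, sd = 0.05))

res <- SCA_Model_evaluation(Testing_data = obs,
                            Simulations  = list(Testing_sim = sim),
                            Predictant   = c("swvl3", "swvl4"))
# With two predictants, `res` is a list of data.frames, each holding the
# six metrics in a "Testing" column (see Value below).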
Value:

For SCA models and SCA_Model_evaluation():
If a single predictant is supplied: returns a data.frame with the column "Testing"
If multiple predictants are supplied: returns a list of data.frames, one per predictant

For SCE models and SCE_Model_evaluation():
If a single predictant is supplied: returns a data.frame with the columns "Training", "Validation", and "Testing"
If multiple predictants are supplied: returns a list of data.frames, one per predictant
Each data.frame contains the following metrics (transcribed into base R in the sketch after this list):
MAE: Mean Absolute Error (mean(abs(obs - sim)))
RMSE: Root Mean Square Error (sqrt(mean((obs - sim)^2)))
NSE: Nash-Sutcliffe Efficiency (1 - (sum((obs - sim)^2) / sum((obs - mean(obs))^2)))
Log.NSE: NSE calculated on log-transformed values
R2: R-squared calculated using linear regression
KGE: Kling-Gupta Efficiency (1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2)), where r is the linear correlation between observations and simulations, alpha = sd(sim)/sd(obs) is the variability ratio, and beta = mean(sim)/mean(obs) is the bias ratio
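As a concrete reference, the six formulas above transcribe into base R as follows. This is an illustrative standalone version, not the package's internal implementation:

# Standalone base-R transcription of the formulas above (illustrative only):
eval_metrics <- function(obs, sim, digits = 3) {
  keep <- !(is.nan(obs) | is.nan(sim))           # drop NaN pairs
  obs <- obs[keep]; sim <- sim[keep]
  obs_l <- obs; sim_l <- sim
  obs_l[obs_l <= 0] <- 0.0001                    # zero/negative -> 0.0001 before log()
  sim_l[sim_l <= 0] <- 0.0001
  r     <- cor(obs, sim)                         # linear correlation
  alpha <- sd(sim) / sd(obs)                     # variability ratio
  beta  <- mean(sim) / mean(obs)                 # bias ratio
  round(c(
    MAE     = mean(abs(obs - sim)),
    RMSE    = sqrt(mean((obs - sim)^2)),
    NSE     = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2),
    Log.NSE = 1 - sum((log(obs_l) - log(sim_l))^2) /
                  sum((log(obs_l) - mean(log(obs_l)))^2),
    R2      = r^2,                               # R2 of a simple linear fit equals r^2
    KGE     = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2)
  ), digits)
}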
Arguments:

object: An object for which performance should be evaluated.

Testing_data: A data.frame containing the observations used during model testing. Must include all specified predictants.

Training_data: A data.frame containing the observations used during model training. Required only for SCE objects and SCE_Model_evaluation().

Simulations: A list containing model predictions. For SCE it must contain 'Training', 'Validation', and 'Testing' components; for SCA it must contain a 'Testing_sim' component. The structure should align with the output generated by the respective model training function (see the sketch after this argument list).

Predictant: A character vector specifying the name(s) of the dependent variable(s) to be evaluated (e.g., c("swvl3", "swvl4")). The specified names must exactly match those used in model training.

digits: An integer specifying the number of decimal places to retain when reporting evaluation metrics. Defaults to 3.

...: Additional arguments passed to methods.
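For reference, the Simulations list has roughly the following shape. The prediction objects here are placeholders (hypothetical data); the real components come from the model's simulation step:

# Minimal sketch of the Simulations shapes (placeholder data, not real output):
pred <- data.frame(swvl3 = runif(30), swvl4 = runif(30))   # stand-in predictions

sims_sca <- list(Testing_sim = pred)                       # SCA / SCA_Model_evaluation
sims_sce <- list(Training   = pred,                        # SCE / SCE_Model_evaluation
                 Validation = pred,
                 Testing    = pred)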
Author: Kailong Li <lkl98509509@gmail.com>
Evaluation Metrics:
The functions evaluate model performance using six distinct metrics:
MAE (Mean Absolute Error): Average absolute difference between observed and predicted values
RMSE (Root Mean Square Error): Square root of the average squared differences
NSE (Nash-Sutcliffe Efficiency): Measures the relative magnitude of residual variance compared to observed variance
Log.NSE: NSE calculated on log-transformed values for better handling of skewed distributions
R2 (R-squared): Coefficient of determination from linear regression
KGE (Kling-Gupta Efficiency): Combines correlation, bias, and variability ratio
Function Differences:
evaluate.SCA(): S3 method for single SCA trees (calls SCA_Model_evaluation())
evaluate.SCE(): S3 method for SCE ensembles (calls SCE_Model_evaluation())
SCA_Model_evaluation(): Direct function for SCA model evaluation
SCE_Model_evaluation(): Direct function for SCE model evaluation
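Since evaluate() is an S3 generic, the method is selected from the object's class. A minimal dispatch illustration (the fitted object here is a dummy, for illustration only):

# Dispatch sketch: R picks the method from the object's class.
fit <- structure(list(), class = "SCA")   # dummy object, for illustration only
inherits(fit, "SCA")                      # TRUE
# evaluate(fit, Testing_data = obs, Predictant = "swvl3")
# would dispatch to evaluate.SCA(), which wraps SCA_Model_evaluation().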
Input Validation: The functions perform comprehensive input validation (sketched after this list):
Data frame structure validation
Presence of required components in Simulations list
Existence of predictants in both data and simulations
Matching row counts between data and simulations
Proper handling of NaN values and zero/negative values
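Roughly, the SCA-side checks amount to assertions of this kind. This is an illustrative sketch, not the package's actual code:

# Illustrative sketch of the SCA-side checks (not the package's code):
check_sca_inputs <- function(Testing_data, Simulations, Predictant) {
  stopifnot(is.data.frame(Testing_data))
  stopifnot("Testing_sim" %in% names(Simulations))                # required component
  stopifnot(all(Predictant %in% colnames(Testing_data)))          # predictants in data
  stopifnot(all(Predictant %in% colnames(Simulations$Testing_sim)))
  stopifnot(nrow(Simulations$Testing_sim) == nrow(Testing_data))  # matching row counts
  invisible(TRUE)
}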
Data Processing (sketched after this list):
Removes NaN values from both observed and simulated data
Handles zero or negative values by replacing them with 0.0001
Calculates all six metrics for each predictant
Formats the results with specified number of decimal places
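A minimal sketch of that preprocessing for one observed/simulated pair, assuming plain numeric vectors:

# Illustrative preprocessing for one observed/simulated vector pair:
clean_pair <- function(obs, sim) {
  keep <- !(is.nan(obs) | is.nan(sim))    # drop positions that are NaN in either series
  obs <- obs[keep]; sim <- sim[keep]
  obs[obs <= 0] <- 0.0001                 # zero/negative -> 0.0001 (protects Log.NSE)
  sim[sim <= 0] <- 0.0001
  list(obs = obs, sim = sim)
}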
See Also: SCA, SCE