load_ppm: Load predictive performance metric (PPM) rasters

Description

eBird Status models are evaluated against a test set of eBird data not used during model training, and a suite of predictive performance metrics (PPMs) is calculated. The PPMs for each base model are summarized to a 27 km resolution raster grid, where each cell value is the average across all models in the ensemble contributing to that cell. These rasters are only available if download_ppms = TRUE was used when calling ebirdst_download_status().
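As a rough illustration of the per-cell averaging described above, the following base R sketch uses a made-up matrix of PPM values for four hypothetical base models over five hypothetical grid cells. The model names, cell names, and values are assumptions for illustration only, and this is not the code used to produce the eBird Status rasters.

```r
# Toy illustration only: rows are hypothetical base models, columns are
# hypothetical grid cells. NA means the model does not contribute to the cell.
ppm_by_model <- rbind(
  model_1 = c(0.60, 0.70,   NA,   NA, NA),
  model_2 = c(  NA, 0.50, 0.80,   NA, NA),
  model_3 = c(  NA,   NA, 0.60, 0.40, NA),
  model_4 = c(0.80, 0.60,   NA, 0.20, NA)
)
colnames(ppm_by_model) <- paste0("cell_", 1:5)

# number of base models contributing to each cell
n_models <- colSums(!is.na(ppm_by_model))

# cell value = mean of the PPM across the contributing models only
cell_ppm <- colMeans(ppm_by_model, na.rm = TRUE)

# cells with no contributing model are set to NA rather than NaN
cell_ppm[n_models == 0] <- NA

print(rbind(n_models, cell_ppm))
```

In this toy setup a cell covered by no model ends up as NA, which is one plausible way for gaps to appear in a raster of this kind.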

Usage

load_ppm(
  species,
  ppm = c("binary_f1", "binary_mcc", "binary_prevalence", "occ_bernoulli_dev",
    "occ_bin_spearman", "occ_brier", "occ_pr_auc", "occ_pr_auc_gt_prev",
    "occ_pr_auc_normalized", "count_log_pearson", "count_mae", "count_poisson_dev",
    "count_rmse", "count_spearman", "abd_log_pearson", "abd_mae", "abd_poisson_dev",
    "abd_rmse", "abd_spearman"),
  path = ebirdst_data_dir()
)

Value

A SpatRaster object with the PPM data. For migrants, rasters are weekly with 52 layers, where the layer names are the dates (MM-DD format) of the midpoint of each week. For residents, a single year-round layer is returned.
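The sketch below shows one way to work with a 52-layer weekly SpatRaster of the kind described above. It uses the terra package and a mock raster filled with random values so that it runs without any downloaded data. The week midpoint dates are placeholders generated for illustration; the real layer names come from the raster that load_ppm() returns and may differ.

```r
library(terra)

# Placeholder week midpoints in MM-DD format (an assumption for this sketch)
week_mid <- format(as.Date("2022-01-04") + 7 * (0:51), "%m-%d")

# Mock 52-layer raster standing in for a weekly PPM raster
set.seed(1)
ppm_mock <- rast(nrows = 10, ncols = 10, nlyrs = 52,
                 vals = runif(10 * 10 * 52))
names(ppm_mock) <- week_mid

# number of weekly layers
nlyr(ppm_mock)

# extract a single week by its layer name
one_week <- ppm_mock[[week_mid[23]]]

# average a run of consecutive weeks into a single seasonal layer
season_mean <- mean(ppm_mock[[week_mid[23:28]]])

# mean PPM across all cells for each week, one row per layer
weekly_mean <- global(ppm_mock, "mean", na.rm = TRUE)
head(weekly_mean)
```

For a resident species the returned raster has a single layer, so the subsetting and seasonal averaging steps would not apply.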

Arguments

species: character; the species to load data for, given as a scientific name, common name or six-letter species code (e.g. "woothr"). The full list of valid species is in the ebirdst_runs data frame included in this package. To download the example dataset, use "yebsap-example".
ppm: character; the name of a single metric to load data for. See Details for definitions of each metric.
path: character; directory that the data were downloaded to. Downloaded files are stored in a sub-directory of this directory named for the data version year, e.g. "2020" for the 2020 Status Data Products. Each species' data package is stored in a directory named with the eBird species code. Defaults to a persistent data directory, which can be found by calling ebirdst_data_dir().

Details

Nineteen predictive performance metrics are provided:

binary_f1: F1-score comparing the model predictions converted to binary with the observed detection/non-detection for the test checklists.
binary_mcc: Matthews Correlation Coefficient (MCC) comparing the model predictions converted to binary with the observed detection/non-detection for the test checklists.
binary_prevalence: the observed detection probability after spatiotemporal subsampling.
occ_bernoulli_dev: proportion of Bernoulli deviance explained comparing the predicted occurrence with the observed detection/non-detection for the test checklists.
occ_bin_spearman: test observations are binned by predicted encounter rate with bin widths of 0.05, then the mean observed prevalence and predicted encounter rate are calculated within bins. This metric is the Spearman's rank correlation coefficient comparing the observed and predicted binned mean values.
occ_brier: the Brier score is the mean squared difference between predicted encounter rate and observed detection/non-detection.
occ_pr_auc: the area under the precision-recall curve (PR AUC) generated by comparing the predicted encounter rate with the observed detection/non-detection for the test checklists.
occ_pr_auc_gt_prev: the proportion of the ensemble for which the PR AUC is greater than observed prevalence, which indicates that the model is performing better than random guessing.
occ_pr_auc_normalized: the PR AUC normalized to account for class imbalance so that a value of 0 represents performance equal to random guessing and a value of 1 represents perfect classification.
count_log_pearson: Pearson correlation coefficient comparing the logarithm of the predicted count with the logarithm of the observed count for the subset of test checklists on which the species was detected.
count_mae: the mean absolute error (MAE) comparing the observed and predicted counts for the subset of test checklists on which the species was detected.
count_poisson_dev: proportion of Poisson deviance explained, comparing the observed and predicted counts for the subset of test checklists on which the species was detected.
count_rmse: root mean squared error (RMSE) comparing the observed and predicted counts for the subset of test checklists on which the species was detected.
count_spearman: Spearman's rank correlation coefficient comparing the observed and predicted counts for the subset of test checklists on which the species was detected.
abd_log_pearson: Pearson correlation coefficient comparing the logarithm of the predicted relative abundance with the logarithm of the observed count for the full set of test checklists.
abd_mae: the mean absolute error (MAE) comparing the observed counts and predicted relative abundance for the full set of test checklists.
abd_poisson_dev: proportion of Poisson deviance explained, comparing the predicted relative abundance with the observed count for the full set of test checklists.
abd_rmse: root mean squared error comparing the predicted relative abundance with the observed count for the full set of test checklists.
abd_spearman: Spearman's rank correlation coefficient comparing the predicted relative abundance with the observed count for the full set of test checklists.
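To make a few of these definitions concrete, here is a minimal base R sketch that computes several of the metrics on a small made-up test set. All data values are invented, the 0.5 threshold used to convert predictions to binary is an arbitrary assumption, and relative abundance is assumed here to be the product of predicted encounter rate and predicted count. It illustrates the formulas only and is not the eBird Status evaluation code.

```r
# Toy test set (all values are made up for illustration)
obs_count  <- c(0, 0, 3, 0, 1, 0, 5, 0, 2, 0)
pred_occ   <- c(0.10, 0.30, 0.80, 0.20, 0.45, 0.05, 0.90, 0.55, 0.70, 0.15)
pred_count <- c(1.5, 2.0, 2.5, 1.0, 1.5, 1.0, 4.0, 2.0, 3.0, 1.0)
obs_det    <- as.integer(obs_count > 0)

# occ_brier: mean squared difference between predicted rate and detection
occ_brier <- mean((pred_occ - obs_det)^2)

# binary metrics, assuming an arbitrary threshold of 0.5
pred_bin <- as.integer(pred_occ >= 0.5)
tp <- sum(pred_bin == 1 & obs_det == 1)
fp <- sum(pred_bin == 1 & obs_det == 0)
fn <- sum(pred_bin == 0 & obs_det == 1)
tn <- sum(pred_bin == 0 & obs_det == 0)
binary_f1  <- 2 * tp / (2 * tp + fp + fn)
binary_mcc <- (tp * tn - fp * fn) /
  sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

# count metrics: only checklists on which the species was detected
det <- obs_det == 1
count_mae      <- mean(abs(obs_count[det] - pred_count[det]))
count_rmse     <- sqrt(mean((obs_count[det] - pred_count[det])^2))
count_spearman <- cor(obs_count[det], pred_count[det], method = "spearman")

# abundance metrics: full test set, with relative abundance assumed to be
# predicted encounter rate times predicted count
pred_abd     <- pred_occ * pred_count
abd_mae      <- mean(abs(obs_count - pred_abd))
abd_rmse     <- sqrt(mean((obs_count - pred_abd)^2))
abd_spearman <- cor(obs_count, pred_abd, method = "spearman")

round(c(occ_brier = occ_brier, binary_f1 = binary_f1, binary_mcc = binary_mcc,
        count_mae = count_mae, count_rmse = count_rmse,
        count_spearman = count_spearman, abd_mae = abd_mae,
        abd_rmse = abd_rmse, abd_spearman = abd_spearman), 3)
```

Note how the count metrics use only the four checklists with detections, while the abundance metrics use all ten, which mirrors the distinction between the count_ and abd_ prefixes above.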

Examples

if (FALSE) {
# download the example data if it hasn't already been downloaded
ebirdst_download_status("yebsap-example", download_ppms = TRUE)

# load area under the precision-recall curve PPM raster
load_ppm("yebsap-example", ppm = "occ_pr_auc")
}
