eBird Status models are evaluated against a test set of eBird data not used
during model training, and a suite of predictive performance metrics (PPMs)
is calculated. The PPMs for each base model are summarized to a 27 km
resolution raster grid, where the cell values are the average across all
models in the ensemble contributing to that cell. These data are available in
raster format provided download_ppms = TRUE was used when calling
ebirdst_download_status().
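The per-cell averaging described above can be sketched in base R. This is only an illustration of "average across all models contributing to a cell": the data frame, its column names, and all values are made up, and the real workflow operates on a 27 km raster grid rather than on labelled cells.

```r
# Toy per-model scores: each row is one base model's metric value for the
# grid cell it contributes to (all values are invented for illustration)
model_scores <- data.frame(
  cell_id = c("A", "A", "A", "B", "B"),
  occ_pr_auc = c(0.60, 0.70, 0.80, 0.40, 0.50)
)

# Average across all models contributing to each cell
cell_means <- aggregate(occ_pr_auc ~ cell_id, data = model_scores, FUN = mean)
print(cell_means)
```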
load_ppm(
  species,
  ppm = c("binary_f1", "binary_mcc", "binary_prevalence", "occ_bernoulli_dev",
    "occ_bin_spearman", "occ_brier", "occ_pr_auc", "occ_pr_auc_gt_prev",
    "occ_pr_auc_normalized", "count_log_pearson", "count_mae", "count_poisson_dev",
    "count_rmse", "count_spearman", "abd_log_pearson", "abd_mae", "abd_poisson_dev",
    "abd_rmse", "abd_spearman"),
  path = ebirdst_data_dir()
)
A SpatRaster object with the PPM data. For
migrants, rasters are weekly with 52 layers, where the layer names are the
dates (MM-DD format) of the midpoint of each week. For residents, a
single year-round layer is returned.
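Because weekly layers are named by week midpoint, it can help to map an arbitrary date to the nearest layer name. The sketch below does this in base R. The layer names are generated from an assumed start date purely for illustration, and nearest_week() is a hypothetical helper, not part of the package; with real data, take the names from names() on the returned raster instead.

```r
# Hypothetical layer names: 52 weekly midpoints in MM-DD format. The start
# date is an assumption for illustration only.
layer_names <- format(as.Date("2021-01-04") + 7 * (0:51), "%m-%d")

# Hypothetical helper: return the layer name whose midpoint is closest to
# a target date within the given year
nearest_week <- function(target, layer_names, year = 2021) {
  midpoints <- as.Date(paste(year, layer_names, sep = "-"))
  target <- as.Date(target)
  layer_names[which.min(abs(as.numeric(midpoints - target)))]
}

week <- nearest_week("2021-06-15", layer_names)
print(week)
```

With a weekly SpatRaster r returned for a migrant, the matching layer could then be selected by name with r[[week]].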
species: character; the species to load data for, given as a scientific
name, common name or six-letter species code (e.g. "woothr"). The full list
of valid species is in the ebirdst_runs data frame included in this
package. To download the example dataset, use "yebsap-example".
ppm: character; the name of a single metric to load data for. See Details
for definitions of each metric.
path: character; directory to download the data to. All downloaded
files will be placed in a sub-directory of this directory named for the
data version year, e.g. "2020" for the 2020 Status Data Products. Each
species' data package will then appear in a directory named with the eBird
species code. Defaults to a persistent data directory, which can be found
by calling ebirdst_data_dir().
Nineteen predictive performance metrics are provided:
binary_f1: F1-score comparing the model predictions converted to binary
with the observed detection/non-detection for the test checklists.
binary_mcc: Matthews Correlation Coefficient (MCC) comparing the model
predictions converted to binary with the observed detection/non-detection
for the test checklists.
binary_prevalence: the observed detection probability after
spatiotemporal subsampling.
occ_bernoulli_dev: proportion of Bernoulli deviance explained comparing
the predicted occurrence with the observed detection/non-detection for the
test checklists.
occ_bin_spearman: test observations are binned by predicted encounter
rate with bin widths of 0.05, then the mean observed prevalence and predicted
encounter rate are calculated within bins. This metric is the Spearman's rank
correlation coefficient comparing the observed and predicted binned mean
values.
occ_brier: the Brier score is the mean squared difference between
predicted encounter rate and observed detection/non-detection.
occ_pr_auc: the area under the precision-recall curve (PR AUC) generated by
comparing the predicted encounter rate with the observed
detection/non-detection for the test checklists.
occ_pr_auc_gt_prev: the proportion of the ensemble for which the PR AUC
is greater than observed prevalence, which indicates that the model is
performing better than random guessing.
occ_pr_auc_normalized: the PR AUC normalized to account for class
imbalance so that a value of 0 represents performance equal to random
guessing and a value of 1 represents perfect classification.
count_log_pearson: Pearson correlation coefficient comparing the
logarithm of the predicted count with the logarithm of the observed count for
the subset of test checklists on which the species was detected.
count_mae: the mean absolute error (MAE) comparing the observed and
predicted counts for the subset of test checklists on which the species was
detected.
count_poisson_dev: proportion of Poisson deviance explained, comparing
the observed and predicted counts for the subset of test checklists on which
the species was detected.
count_rmse: root mean squared error (RMSE) comparing the observed and
predicted counts for the subset of test checklists on which the species was
detected.
count_spearman: Spearman's rank correlation coefficient comparing the
observed and predicted counts for the subset of test checklists on which the
species was detected.
abd_log_pearson: Pearson correlation coefficient comparing the logarithm
of the predicted relative abundance with the logarithm of the observed
count for the full set of test checklists.
abd_mae: the mean absolute error (MAE) comparing the observed counts and
predicted relative abundance for the full set of test checklists.
abd_poisson_dev: proportion of Poisson deviance explained, comparing the
predicted relative abundance with the observed count for the full set of test
checklists.
abd_rmse: root mean squared error (RMSE) comparing the predicted relative
abundance with the observed count for the full set of test checklists.
abd_spearman: Spearman's rank correlation coefficient comparing the
predicted relative abundance with the observed count for the full set of
test checklists.
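To make a few of these definitions concrete, the following base R sketch computes several of them on a small invented test set. It uses textbook formulas, not the package's own code: the 0.5 threshold for converting predictions to binary, the linear rescaling used for the normalized PR AUC, and the form of the Poisson deviance are all assumptions, and the data and helper names are hypothetical.

```r
# Toy test set (all values invented): observed detection, predicted
# encounter rate, observed counts and predicted counts per checklist
obs_det    <- c(1, 0, 1, 1, 0, 0, 1, 0)
pred_occ   <- c(0.9, 0.2, 0.6, 0.4, 0.1, 0.7, 0.8, 0.3)
obs_count  <- c(3, 0, 1, 2, 0, 0, 5, 0)
pred_count <- c(2.5, 0.1, 1.5, 1.0, 0.0, 0.8, 4.0, 0.2)

# Binary metrics: convert predictions to binary with an assumed threshold
threshold <- 0.5
pred_bin <- as.integer(pred_occ >= threshold)
tp <- sum(pred_bin == 1 & obs_det == 1)
fp <- sum(pred_bin == 1 & obs_det == 0)
fn <- sum(pred_bin == 0 & obs_det == 1)
tn <- sum(pred_bin == 0 & obs_det == 0)
f1 <- 2 * tp / (2 * tp + fp + fn)
mcc <- (tp * tn - fp * fn) /
  sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

# Brier score: mean squared difference between prediction and outcome
brier <- mean((pred_occ - obs_det)^2)

# Assumed normalization: rescale so PR AUC equal to prevalence maps to 0
# and a perfect PR AUC of 1 maps to 1
pr_auc_normalized <- function(pr_auc, prevalence) {
  (pr_auc - prevalence) / (1 - prevalence)
}

# Count metrics use only checklists on which the species was detected
detected <- obs_det == 1
y  <- obs_count[detected]
mu <- pred_count[detected]
count_mae      <- mean(abs(y - mu))
count_rmse     <- sqrt(mean((y - mu)^2))
count_spearman <- cor(y, mu, method = "spearman")

# Proportion of Poisson deviance explained, relative to a mean-only model
poisson_dev <- function(y, mu) {
  term <- ifelse(y > 0, y * log(y / mu), 0)
  2 * sum(term - (y - mu))
}
count_poisson_dev <- 1 - poisson_dev(y, mu) /
  poisson_dev(y, rep(mean(y), length(y)))

print(c(f1 = f1, mcc = mcc, brier = brier, mae = count_mae,
        rmse = count_rmse, spearman = count_spearman,
        poisson_dev = count_poisson_dev))
```

The abd_* metrics follow the same pattern but are evaluated on the full set of test checklists rather than the detected subset.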
if (FALSE) {
  # download the example data if it hasn't already been downloaded
  ebirdst_download_status("yebsap-example", download_ppms = TRUE)

  # load the area under the precision-recall curve PPM raster
  load_ppm("yebsap-example", ppm = "occ_pr_auc")
}