nopa: Normalized Ordinal Prediction Agreement (NOPA)

Description

Compute the Normalized Ordinal Prediction Agreement (NOPA) metric, a performance measure for models with ordinal-scaled response variables that output estimated probability distributions (EPDs) instead of predicted labels.

This function assesses the predictive quality of a model for an ordinal response by aggregating the predicted probability mass as a function of the level of disagreement with respect to the observed category. It provides a normalized and interpretable score between 0 and 1, where 1 indicates perfect agreement and 0 represents the worst possible prediction.

NOPA compares the estimated probability distribution produced by a model for each unit of analysis against the observed ordinal response of the same unit. For a single unit, the disagreement ranges from 0 (minimum) to \(k-1\) (maximum), where \(k\) is the number of ordinal categories of the response variable. The function then aggregates the disagreements of all units of analysis into a single measure.

The function internally computes:

OPD: Ordinal Prediction Disagreement, the average level of disagreement between the predicted and observed categories.
w: The worst possible OPD given the dataset, representing the maximum disagreement achievable.
NOPA: The normalized agreement metric defined as \(1 - OPD / w\).
OPDempDist, OPDur, NOPAempDist, NOPAur: Reference values for empirical and uniform-random baselines, used to put the OPD and NOPA of the model in context.
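
The relation between OPD, w, and NOPA can be illustrated with a minimal sketch. It is not the package's source code: the helper name `opd_sketch` is hypothetical, and it assumes that the disagreement between category \(j\) and observed category \(y\) is \(|j - y|\), that OPD is the mean expected disagreement over units, and that w places all probability mass on the category farthest from each observed one.

```r
# Hypothetical re-implementation for illustration only (not the package code).
opd_sketch <- function(predMat, obsVect) {
  k <- ncol(predMat)
  # Assumed disagreement |j - y_i| for every unit i and category j (n x k matrix)
  disagreements <- abs(outer(obsVect, seq_len(k), function(y, j) j - y))
  # Expected disagreement per unit, averaged over all units
  OPD <- mean(rowSums(predMat * disagreements))
  # Assumed worst case: all mass on the category farthest from the observed one
  w <- mean(pmax(obsVect - 1, k - obsVect))
  list(OPD = OPD, w = w, NOPA = 1 - OPD / w)
}

# Three units, k = 3 categories
predMat <- rbind(c(1, 0, 0), c(0, 0, 1), c(0.5, 0.5, 0))
obsVect <- c(1, 1, 2)
res <- opd_sketch(predMat, obsVect)
res$NOPA
```

Under these assumptions the first unit is predicted perfectly, the second is the worst possible prediction, and the third splits its mass between a neighbouring category and the observed one, so the sketch returns a NOPA of 0.5.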

Usage

nopa(predMat, obsVect)

Value

A list containing:

predMat: Input matrix of predicted probabilities.
obsVect: Input vector of observed categories.
disagreementsObs: A matrix with \(k\) columns (number of ordinal categories of the response variable), and \(n\) rows. Each row shows the level of disagreement of each ordinal category with respect to the observed one for the same unit of analysis.
rearrangedProbObs: Matrix of probabilities aggregated by level of disagreement.
meanDistObs: Mean aggregated disagreement profile.
OPD: Observed Ordinal Prediction Disagreement.
w: OPD for the worst prediction possible (maximum disagreement).
NOPA: Normalized Ordinal Prediction Agreement (main metric).
OPDempDist: Reference value for OPD. It is the ordinal prediction disagreement obtained when the estimated probability distribution over the \(k\) categories of the ordinal response equals the empirical distribution of the observed response.
OPDur: Reference value for OPD. It is the ordinal prediction disagreement obtained when the observed response keeps its own empirical distribution and the estimated probability distribution over the \(k\) categories is uniform.
NOPAempDist: Reference value for NOPA. It is the normalized ordinal prediction agreement obtained when the estimated probability distribution over the \(k\) categories of the ordinal response equals the empirical distribution of the observed response.
NOPAur: Reference value for NOPA. It is the normalized ordinal prediction agreement obtained when the estimated probability distribution over the \(k\) categories of the ordinal response is uniform.
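
As a rough illustration of the two reference points, the sketch below scores a "model" that predicts the empirical distribution for every unit and another that predicts a uniform distribution. It rests on assumptions not confirmed by this page: the helper `opd_w` is hypothetical, disagreement is taken as \(|j - y|\), and w is taken as the mean distance to the farthest category. The values returned by `nopa()` may be computed differently.

```r
# Hypothetical helper for illustration only (not the package code).
opd_w <- function(predMat, obsVect) {
  k <- ncol(predMat)
  d <- abs(outer(obsVect, seq_len(k), function(y, j) j - y))  # assumed |j - y_i|
  c(OPD = mean(rowSums(predMat * d)),
    w   = mean(pmax(obsVect - 1, k - obsVect)))
}

k <- 4
obsVect <- c(1, 2, 2, 4)
n <- length(obsVect)

# Empirical baseline: every row equals the observed category proportions
empDist <- tabulate(obsVect, nbins = k) / n
predEmp <- matrix(empDist, nrow = n, ncol = k, byrow = TRUE)

# Uniform baseline: every row puts mass 1/k on each category
predUnif <- matrix(1 / k, nrow = n, ncol = k)

emp <- opd_w(predEmp, obsVect)
unif <- opd_w(predUnif, obsVect)
NOPAempDist_sketch <- unname(1 - emp["OPD"] / emp["w"])
NOPAur_sketch <- unname(1 - unif["OPD"] / unif["w"])
```

A fitted model whose NOPA does not exceed both reference values is, under this reading, doing no better than a prediction that ignores the covariates.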

Arguments

predMat: A numeric matrix with \(k\) columns and \(n\) rows, where \(k\) is the number of ordinal categories and \(n\) is the number of units of analysis. Each row must be the estimated probability distribution of that unit's response over the \(k\) categories.
obsVect: A numeric or integer vector of observed categories, with values from 1 to \(k\), where \(k\) is the number of categories of the ordinal response variable (matching the number of columns in predMat).
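
The requirements on the two arguments can be checked before calling the function. The helper below is a hypothetical sketch, not part of the package, and the tolerance `tol` is an arbitrary choice; whether `nopa()` performs similar checks internally is not stated here.

```r
# Hypothetical helper (not part of the package): basic sanity checks on inputs.
check_nopa_inputs <- function(predMat, obsVect, tol = 1e-8) {
  stopifnot(is.matrix(predMat), is.numeric(predMat), is.numeric(obsVect))
  k <- ncol(predMat)
  c(
    lengthsMatch = nrow(predMat) == length(obsVect),      # one observation per row
    nonNegative  = all(predMat >= 0),                     # probabilities are >= 0
    rowsSumToOne = all(abs(rowSums(predMat) - 1) < tol),  # each row is a distribution
    obsInRange   = all(obsVect %in% seq_len(k))           # categories coded 1..k
  )
}

# Valid input: every check passes
good <- check_nopa_inputs(rbind(c(0.2, 0.8), c(1, 0)), c(2, 1))

# Invalid input: first row sums to 0.9 and category 3 exceeds k = 2
bad <- check_nopa_inputs(rbind(c(0.2, 0.7), c(1, 0)), c(3, 1))
```

The function returns a named logical vector, so `names(bad)[!bad]` lists the checks that failed.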

References

Javier

Examples

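# Simulate 20 estimated probability distributions over k = 5 categories
EPD <- t(apply(matrix(runif(100), ncol = 5), 1, function(y) y / sum(y)))
# Check that every row sums to 1, allowing for floating-point error
isTRUE(all.equal(rowSums(EPD), rep(1, nrow(EPD))))
# Simulate 20 observed ordinal responses and compute NOPA
ordResponse <- sample(1:5, 20, replace = TRUE)
nopa(predMat = EPD, obsVect = ordResponse)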