GPAV: Generalized Pool-Adjacent Violators

Description

Implementation of GPAV (Generalized Pool-Adjacent Violators) algorithm. (Burdakov et al., In: Di Pillo G, Roma M, editors. An O(n2) Algorithm for Isotonic Regression. Boston, MA: Springer US; 2006. p. 25<U+2013>33. Available from: https://doi.org/10.1007/0-387-30065-1_3

Usage

GPAV(Y, W = NULL, adj)

Arguments

vector of scores relative to a single example. Y must be a numeric named vector, where names correspond to classes' names, i.e. nodes of the graph g (root node included).

vector of weight relative to a single example. If the vector W is not specified (def. W=NULL), it is assumed that W is a unitary vector of the same length of the vector Y.

adj

adjacency matrix of the graph which must be sparse, logical and upper triangular. Number of columns of adj must be equal to the length of Y and W.

Value

a list of 3 elements:

YFit: a named vector with the scores of the classes corrected according to the GPAV algorithm. NOTE: the classes of YFit are topologically sorted, that is are in the same order of those of adj.
blocks: list of vectors, containing the partitioning of nodes (represented with an integer number) into blocks;
W: vector of weights.

Details

Given the constraints adjacency matrix of the graph, a vector of scores $\hat{y} \in R^n$ and a vector of strictly positive weights $w \in R^n$, the GPAV algorithm returns a vector $\bar{y}$ which is as close as possible, in the least-squares sense, to the response vector $\hat{y}$ and whose components are partially ordered in accordance with the constraints matrix adj. In other words, GPAV solves the following problem: $$ \bar{y} = \left\{ \begin{array}{l} \min \sum_{i \in N} (\hat{y}_i - \bar{y}_i )^2\\\\ \forall i, \quad j \in par(i) \Rightarrow \bar{y}_j \geq \bar{y}_i \end{array} \right. $$

Examples

Run this code

# NOT RUN {
data(graph);
data(scores);
Y <- S[3,];
adj <- adj.upper.tri(g);
S.GPAV <- GPAV(Y,W=NULL,adj);
# }