HEMDAG (version 2.1.3)

TPR-DAG: TPR-DAG variants

Description

Different variants of the TPR-DAG algorithm are implemented. In their most general form, the TPR-DAG algorithms adopt a two-step learning strategy:

  1. in the first step they compute a per-level bottom-up visit from the leaves to the root to propagate positive predictions across the hierarchy;

  2. in the second step they compute a per-level top-down visit from the root to the leaves to ensure the hierarchical consistency of the predictions.
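
The top-down step of point 2 is not spelled out on this page; the sketch below illustrates one way it can be realized, in the spirit of the HTD-DAG algorithm recalled in the Details section: each node score is capped by the minimum corrected score of its parents, so that no class can score higher than its ancestors. This is an illustrative sketch, not the package implementation: top.down.sketch is a hypothetical name, graph.levels is the HEMDAG utility grouping the nodes of g per level (root level first), reverseEdgeDirections belongs to the graph package, and S is assumed to contain a column for the root.

library(graph);    # Bioconductor package providing the graphNEL class
library(HEMDAG);   # provides graph.levels
# hypothetical helper, written for this page only
top.down.sketch <- function(S, g, root="00") {
    levels <- graph.levels(g, root);              # nodes of g grouped per level, root level first
    parents <- edges(reverseEdgeDirections(g));   # adjacency list of the reversed graph: node -> parents
    Y <- S;                                       # corrected scores, initialized to the input scores
    for(lev in levels[-1]) {                      # per-level top-down visit, skipping the root level
        for(i in lev) {
            for(ex in rownames(S)) {
                # cap the score of i by the minimum corrected score of its parents;
                # assumes S has a column for the root
                Y[ex,i] <- min(S[ex,i], min(Y[ex, parents[[i]]]));
            }
        }
    }
    return(Y);
}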

Usage

tpr.threshold(S, g, root = "00", t = 0.5)

tpr.threshold.free(S, g, root = "00")

tpr.weighted.threshold.free(S, g, root = "00", w = 0.5)

tpr.weighted.threshold(S, g, root = "00", t = 0.5, w = 0.5)

Arguments

S

a named flat scores matrix with examples on rows and classes on columns

g

a graph of class graphNEL. It represents the hierarchy of the classes

root

name of the class at the top level of the hierarchy (def. root="00")

t

threshold for the choice of positive children (def. t=0.5)

w

weight to balance between the contribution of the node \(i\) and that of its positive children (def. w=0.5)

Value

a named matrix with the scores of the classes corrected according to the TPR-DAG algorithm.

Details

The vanilla TPR-DAG adopts a per-level bottom-up traversal of the DAG to correct the flat predictions \(\hat{y}_i\): $$ \bar{y}_i := \frac{1}{1 + |\phi_i|} (\hat{y}_i + \sum_{j \in \phi_i} \bar{y}_j) $$ where \(\phi_i\) are the positive children of \(i\). Different strategies to select the positive children \(\phi_i\) can be applied:

  1. Threshold-Free strategy: the positive nodes are those children that can increment the score of the node \(i\), that is those nodes achieving a score higher than that of their parent: $$ \phi_i := \{ j \in child(i) | \bar{y}_j > \hat{y}_i \} $$ (a minimal sketch of this strategy in R follows the list below)

  2. Threshold strategy: the positive children are selected on the basis of a threshold that can be set in two different ways:

    1. for each node a constant threshold \(\bar{t}\) is a priori selected: $$ \phi_i := \{ j \in child(i) | \bar{y}_j > \bar{t} \} $$ For instance, if the predictions represent probabilities, it could be meaningful to a priori select \(\bar{t}=0.5\).

    2. the threshold is selected to maximize some performance metric \(\mathcal{M}\) estimated on the training data, as for instance the F-score or the AUPRC. In other words the threshold is selected to maximize some measure of accuracy of the predictions \(\mathcal{M}(j,t)\) on the training data for the class \(j\) with respect to the threshold \(t\). The corresponding set of positives \(\forall i \in V\) is: $$ \phi_i := \{ j \in child(i) | \bar{y}_j > t_j^*, t_j^* = \arg \max_{t} \mathcal{M}(j,t) \} $$ For instance \(t_j^*\) can be selected from a set of \(t \in (0,1)\) through internal cross-validation techniques.
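
As referenced in point 1 above, the following is a minimal sketch of the vanilla bottom-up visit with the threshold-free selection of the positive children. It is an illustration written for this page, not the HEMDAG implementation: tpr.bottom.up.sketch is a hypothetical name, while graph.levels (HEMDAG) and edges (graph package) are existing utilities; levels are assumed to be returned root level first, so they are visited in reverse order.

library(graph);    # for edges() and the graphNEL class
library(HEMDAG);   # for graph.levels()
# hypothetical helper, written for this page only
tpr.bottom.up.sketch <- function(S, g, root="00") {
    levels <- graph.levels(g, root);    # nodes of g grouped per level, root level first
    chd <- edges(g);                    # adjacency list: node -> its children
    Y <- S;                             # corrected scores, initialized to the flat scores
    for(lev in rev(levels)) {           # per-level bottom-up visit (leaves -> root)
        for(i in lev) {
            children <- chd[[i]];
            if(length(children)==0)     # leaves keep their flat scores
                next;
            for(ex in rownames(S)) {
                # threshold-free selection of the positive children phi_i
                phi <- children[Y[ex,children] > S[ex,i]];
                # vanilla TPR-DAG update; with empty phi it reduces to the flat score
                Y[ex,i] <- (S[ex,i] + sum(Y[ex,phi])) / (1 + length(phi));
            }
        }
    }
    return(Y);
}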

The weighted TPR-DAG version is obtained by adding a weight \(w \in [0,1]\) to balance between the contribution of the node \(i\) and that of its positive children \(\phi_i\), through their convex combination: $$ \bar{y}_i := w \hat{y}_i + \frac{(1 - w)}{|\phi_i|} \sum_{j \in \phi_i} \bar{y}_j $$ If \(w=1\) no weight is attributed to the children and the TPR-DAG reduces to the HTD-DAG algorithm, since in this way only the prediction for node \(i\) is used in the bottom-up step of the algorithm. If \(w=0\) only the predictors associated with the children nodes vote to predict node \(i\). In the intermediate cases more importance is attributed to the predictor for the node \(i\) or to its children, depending on the value of \(w\).
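
In a sketch like the one above, the weighted variant only changes the update rule: since \(\frac{(1-w)}{|\phi_i|} \sum_{j \in \phi_i} \bar{y}_j\) equals \((1-w)\) times the mean of the positive children's scores, the inner assignment becomes the fragment below. When \(\phi_i\) is empty the convex combination is undefined; keeping the flat score in that case is an assumption of this sketch, consistent with the \(w=1\) behaviour.

# inside the per-example loop of the sketch above (w in [0,1]):
phi <- children[Y[ex,children] > S[ex,i]];          # positive children of i
if(length(phi)==0) {
    Y[ex,i] <- S[ex,i];                             # sketch assumption: keep the flat score
} else {
    Y[ex,i] <- w*S[ex,i] + (1-w)*mean(Y[ex,phi]);   # convex combination
}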

Examples

# NOT RUN {
library(HEMDAG);
data(graph);    # loads g, the graphNEL object encoding the hierarchy
data(scores);   # loads S, the named flat scores matrix
data(labels);   # loads L, the label matrix (not needed by the calls below)
root <- root.node(g);    # name of the root node of the hierarchy
S.tprTF <- tpr.threshold.free(S,g,root);
S.tprT <- tpr.threshold(S,g,root,t=0.5);
S.tprW <- tpr.weighted.threshold.free(S,g,root,w=0.5);
S.tprWT <- tpr.weighted.threshold(S,g,root,w=0.5,t=0.5);
# }
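
After the correction each class should score no higher than its parents. HEMDAG also provides the check.hierarchy utility to verify this constraint on a corrected matrix; a usage line (the exact form of its output may vary across package versions):

check.hierarchy(S.tprTF, g, root);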
