HEMDAG (version 2.1.3)

find.best.f: Best hierarchical F-score

Description

Selects the best hierarchical F-score by choosing an appropriate threshold on the predicted scores.

Usage

find.best.f(target, pred, n.round = 3, f.criterion = "F", verbose = TRUE,
  b.per.example = FALSE)

Arguments

target

matrix with the target multilabels: rows correspond to examples and columns to classes. \(target[i,j]=1\) if example \(i\) belongs to class \(j\), \(target[i,j]=0\) otherwise

pred

a numeric matrix with continuous predicted values (scores): rows correspond to examples and columns to classes

n.round

number of rounding digits to be applied to pred (default=3)

f.criterion

character. Type of F-measure to be used to select the best F-score. There are two possibilities:

  1. F (def.) corresponds to the harmonic mean between the average precision and recall;

  2. avF corresponds to the per-example F-score averaged across all the examples.
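
The two criteria generally give different values. The sketch below illustrates the difference on a toy example in base R. It assumes the usual set-based per-example precision and recall and a convention of 0 when a denominator is zero; the variable names and the helper `harmonic` are illustrative and are not taken from HEMDAG.

```r
# Toy binary targets and thresholded predictions: 2 examples x 4 classes.
target <- matrix(c(1, 1, 0, 0,
                   1, 1, 1, 1), nrow = 2, byrow = TRUE)
predicted <- matrix(c(1, 0, 0, 0,
                      1, 1, 1, 1), nrow = 2, byrow = TRUE)

# Per-example true positives, precision and recall.
tp <- rowSums(target * predicted)
precision <- ifelse(rowSums(predicted) > 0, tp / rowSums(predicted), 0)
recall <- tp / rowSums(target)

# Harmonic mean, with 0 as an assumed convention when both terms are 0.
harmonic <- function(p, r) ifelse(p + r > 0, 2 * p * r / (p + r), 0)

# "F": harmonic mean of the average precision and the average recall.
f_of_averages <- harmonic(mean(precision), mean(recall))

# "avF": per-example F-scores averaged across the examples.
average_of_f <- mean(harmonic(precision, recall))
```

Here the first example has precision 1 and recall 0.5, so the two summaries disagree: `f_of_averages` is 6/7 while `average_of_f` is 5/6.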

verbose

boolean. If TRUE (def.) the number of iterations is printed on stdout

b.per.example

boolean.

  • TRUE: results are returned for each example;

  • FALSE: only the average results are returned

Value

The output depends on the input parameter b.per.example:

  • b.per.example==FALSE: a list with a single element average, a named vector with 7 elements relative to the best result in terms of the F.measure: Precision (P), Recall (R), Specificity (S), F.measure (F), av.F.measure (av.F), Accuracy (A) and the best selected Threshold (T). F is the F-measure computed as the harmonic mean between the average precision and recall; av.F is the F-measure computed as the average across examples;

  • b.per.example==TRUE: a list with two elements:

    1. average: a named vector with 7 elements relative to the best result in terms of the F.measure: Precision (P), Recall (R), Specificity (S), F.measure (F), av.F.measure (av.F), Accuracy (A) and the best selected Threshold (T).

    2. per.example: a named matrix with one row per example. Row names correspond to examples; column names correspond respectively to Precision (P), Recall (R), Specificity (S), Accuracy (A), F-measure (F), av.F-measure (av.F) and the best selected Threshold (T).

Details

All the examples having no positive annotations are discarded. The predicted scores matrix (pred) is rounded according to parameter n.round and all the values of pred are divided by max(pred). Then every distinct value in pred is tried as a threshold, and the threshold leading to the maximum F-measure is selected.
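
The procedure described above can be sketched in base R as follows. This is a minimal illustration and not the HEMDAG implementation: the helper name `best_threshold_sketch` is hypothetical, a score is assumed to count as a positive prediction when it is greater than or equal to the threshold, precision is assumed to be 0 for examples with no predicted class, and only the "F" criterion is covered.

```r
# Hypothetical helper, not part of HEMDAG: sweep thresholds and keep the best F.
best_threshold_sketch <- function(target, pred, n.round = 3) {
  # Discard examples (rows) without any positive annotation.
  keep <- rowSums(target) > 0
  target <- target[keep, , drop = FALSE]
  pred <- pred[keep, , drop = FALSE]

  # Round the scores, then rescale them by the maximum score.
  pred <- round(pred, n.round)
  stopifnot(max(pred) > 0)  # assumption: at least one positive score
  pred <- pred / max(pred)

  best <- c(F = -Inf, T = NA_real_)
  # Try every distinct score as a candidate threshold.
  for (thr in sort(unique(as.vector(pred)))) {
    predicted <- (pred >= thr) * 1  # assumption: ">=" marks a positive
    tp <- rowSums(target * predicted)
    n.pred <- rowSums(predicted)
    p <- mean(ifelse(n.pred > 0, tp / n.pred, 0))  # average precision
    r <- mean(tp / rowSums(target))                # average recall
    f <- if (p + r > 0) 2 * p * r / (p + r) else 0
    if (f > best[["F"]]) best <- c(F = f, T = thr)
  }
  best
}

# Toy data: the third example has no positive label and is dropped.
target <- matrix(c(1, 0, 0,
                   0, 1, 1,
                   0, 0, 0), nrow = 3, byrow = TRUE)
pred <- matrix(c(0.9, 0.2, 0.1,
                 0.3, 0.8, 0.6,
                 0.5, 0.5, 0.5), nrow = 3, byrow = TRUE)
res <- best_threshold_sketch(target, pred)
```

Because the scores are divided by max(pred), the returned threshold is expressed on the rescaled scores, not on the original ones.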

The row and column names of the target and pred matrices must be provided in the same order; otherwise the function stops with an error message.
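
One way to satisfy this requirement before calling the function is to reorder pred by the names of target. The snippet below is a small sketch on made-up matrices; it assumes both matrices carry the same set of row and column names, and the object name `pred.aligned` is illustrative.

```r
# Toy matrices with the same names but in a different order.
target <- matrix(c(1, 0, 0, 1), nrow = 2,
                 dimnames = list(c("ex1", "ex2"), c("A", "B")))
pred <- matrix(c(0.2, 0.7, 0.9, 0.1), nrow = 2,
               dimnames = list(c("ex2", "ex1"), c("B", "A")))

# Reorder pred so that its rows and columns follow the names of target.
pred.aligned <- pred[rownames(target), colnames(target), drop = FALSE]

# Check that names now match in both content and order.
stopifnot(identical(dimnames(pred.aligned), dimnames(target)))
```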

Examples

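# NOT RUN {
library(HEMDAG)
# Load the example data: the graph (g), the labels (L) and the scores (S).
data(graph)
data(labels)
data(scores)
# Drop the root class from both matrices; logical indexing is safe even if no column matches.
root <- root.node(g)
L <- L[, colnames(L) != root, drop = FALSE]
S <- S[, colnames(S) != root, drop = FALSE]
# Select the threshold maximizing the F-score and keep the per-example results.
FMM <- find.best.f(L, S, n.round = 3, f.criterion = "F", verbose = TRUE, b.per.example = TRUE)
# }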