Use this function to find all biomarkers across many
performance classification group matchings, based on a given threshold between
0 and 1. The selection rule is as follows: if at least one value in a column
of the diff.mat matrix surpasses the given threshold, the corresponding node
(the name of the column) is returned as a biomarker. In other words, for a
single node, if at least one value representing an average data difference
(for example, the average activity state difference) in any of the given
classification group comparisons is above the threshold (below the negative
threshold), then a positive (negative) biomarker is reported.
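The selection rule described above can be sketched in base R. This is only an illustration, not the package's implementation: the matrix values and node names are made up, and the use of strict inequalities is an assumption.

```r
# Toy matrix of average differences (values are made up for illustration).
# Rows: classification group comparisons; columns: network nodes.
diff.mat <- matrix(
  c(-0.20, 0.75,  0.10,
     0.30, 0.85, -0.95),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("(1,2)", "(2,3)"), c("A", "B", "C"))
)
threshold <- 0.7

# A column qualifies if at least one of its values passes the threshold.
# Strict inequalities are an assumption of this sketch.
is.pos <- apply(diff.mat, 2, function(x) any(x > threshold))
is.neg <- apply(diff.mat, 2, function(x) any(x < -threshold))

pos.nodes <- colnames(diff.mat)[is.pos]  # "B"
neg.nodes <- colnames(diff.mat)[is.neg]  # "C"
```

Node A never passes the threshold in either direction, so it is reported in neither group.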
get_biomarkers(diff.mat, threshold)
diff.mat: a matrix whose rows are vectors of average node data
differences between two groups of models based on some kind of classification
(e.g. number of TP predictions). The row names are set in the rownames
attribute of the matrix and usually denote the compared classification
groups: e.g. (1,2) means the models that predicted 1 TP synergy vs the models
that predicted 2 TP synergies, if the classification is done by number of TP
predictions. The columns represent the network's node names.
threshold: numeric. A number in the [0,1] interval, above which (or below its negative value) a biomarker will be registered in the returned result. Values closer to 1 translate to a stricter threshold and thus fewer biomarkers are found.
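As a rough illustration of the expected inputs, the sketch below builds a small matrix in the described layout and passes it to the function. The data values and node names are hypothetical, and the sketch assumes the function is provided by the emba package, so the call is skipped when that package is not installed.

```r
# Hypothetical data: three comparisons of TP-based groups over four nodes.
diff.mat <- rbind(
  "(0,1)" = c(0.12, -0.40,  0.05, 0.81),
  "(1,2)" = c(0.33, -0.76,  0.02, 0.64),
  "(2,3)" = c(0.48, -0.91, -0.10, 0.58)
)
colnames(diff.mat) <- c("node1", "node2", "node3", "node4")
threshold <- 0.7

# Basic sanity checks on the inputs described above.
stopifnot(is.matrix(diff.mat), is.numeric(diff.mat))
stopifnot(threshold >= 0, threshold <= 1)
stopifnot(!is.null(rownames(diff.mat)), !is.null(colnames(diff.mat)))

# Assumption: get_biomarkers() comes from the emba package.
if (requireNamespace("emba", quietly = TRUE)) {
  res <- emba::get_biomarkers(diff.mat, threshold)
  str(res)  # inspect the returned list
}
```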
a list with two elements:
biomarkers.pos: a character vector that includes the node names of the positive biomarkers
biomarkers.neg: a character vector that includes the node names of the negative biomarkers
This function uses the get_biomarkers_per_type function
to get the biomarkers (nodes) of both types (positive and negative) from the
average data differences matrix. If a node surpasses the given significance
threshold both negatively and positively, it is kept as a biomarker only
in the category that corresponds to the comparison of the highest
classification groups. For example, if the data comes from a model performance
classification based on the MCC score, and the node of interest had an average
difference of -0.89 in the comparison of the MCC classes (1,3) (a negative
biomarker) and 0.91 in the comparison of the MCC classes (3,4) (a positive
biomarker), then the node is kept only as a positive biomarker. The rationale
is that the higher (performance-wise) the classification groups being
compared, the more confident we can be that the average data difference
indicates the correct type of biomarker.
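The tie-breaking rule for nodes that pass the threshold in both directions can be sketched as follows. This is a minimal base-R illustration, not the package's code: resolve_type is a hypothetical helper, the data is made up, and the sketch assumes that the rows of the matrix are ordered from the lowest to the highest classification group comparison.

```r
# Hypothetical MCC-based comparisons, ordered from lowest to highest groups.
diff.mat <- matrix(
  c(-0.89,  0.10,
     0.20, -0.75,
     0.91,  0.30),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("(1,3)", "(2,3)", "(3,4)"), c("X", "Y"))
)
threshold <- 0.7

# Hypothetical helper: decide the biomarker type of one node (one column).
resolve_type <- function(values, threshold) {
  hits <- which(abs(values) > threshold)
  if (length(hits) == 0) return(NA_character_)
  # The last hit belongs to the comparison of the highest groups
  # (assumes rows are ordered from lowest to highest comparison).
  last.hit <- values[max(hits)]
  if (last.hit > 0) "positive" else "negative"
}

types <- apply(diff.mat, 2, resolve_type, threshold = threshold)
```

Node X passes the threshold negatively in (1,3) and positively in (3,4), so the later comparison decides and it ends up as a positive biomarker. Node Y passes only once, negatively, and stays a negative biomarker.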
Other biomarker functions: get_biomarkers_per_type