emba (version 0.1.1)

get_avg_activity_diff_based_on_mcc_clustering: Get the average activity difference based on MCC clustering

Description

This function splits the models into 'good' and 'bad' groups based on an MCC value clustering method: class.id.high denotes the group id with the higher MCC values (good model group), whereas class.id.low denotes the group id with the lower MCC values (bad model group). Then, for each network node, the function finds the node's average activity in each of the two classes (a value in the [0,1] interval) and subtracts the bad class average activity value from the good one.

Usage

get_avg_activity_diff_based_on_mcc_clustering(models.mcc,
  models.stable.state, mcc.class.ids, models.cluster.ids, class.id.low,
  class.id.high)

Arguments

models.mcc

a numeric vector of Matthews Correlation Coefficient (MCC) scores, one for each model. The names attribute holds the models' names. Can be the result of using the function calculate_models_mcc.

models.stable.state

a matrix (nxm) with n models and m nodes. The row names of the matrix specify the models' names (same order as in the models.mcc parameter) whereas the column names specify the names of the network nodes (genes, proteins, etc.). Possible values for each model-node element are either 0 (inactive node) or 1 (active node).

mcc.class.ids

a numeric vector of group/class ids. It starts with NaN if models with a NaN MCC score are included, and with 1 otherwise. E.g. c(1,2,3) denotes 3 MCC classes with no NaN values.

models.cluster.ids

a numeric vector of cluster ids assigned to each model. It is the result of using Ckmeans.1d.dp with the sorted vector of the models' MCC values (no NaNs included) as input.

class.id.low

integer. This number specifies the MCC class id of the 'bad' models.

class.id.high

integer. This number specifies the MCC class id of the 'good' models and needs to be strictly higher than class.id.low.

Value

a numeric vector with values in the [-1,1] interval (minimum and maximum possible average difference) and with the names attribute representing the name of the nodes.
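
The returned values can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy matrix, the model and node names, and the assumption that the cluster ids map directly onto the two classes (1 = lower MCC, 2 = higher MCC) are all hypothetical and serve only to show the arithmetic described above.

```r
# Toy stable state matrix: 4 models (rows) x 3 nodes (columns).
# All names and values here are hypothetical.
models.stable.state <- matrix(
  c(1, 0, 1,
    1, 1, 0,
    0, 1, 1,
    0, 0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("model", 1:4), c("A", "B", "C"))
)

# Assumed cluster id per model (1 = lower MCC class, 2 = higher MCC class)
models.cluster.ids <- c(2, 2, 1, 1)
class.id.low  <- 1
class.id.high <- 2

# Split the models into the 'good' and 'bad' groups
good <- models.stable.state[models.cluster.ids == class.id.high, , drop = FALSE]
bad  <- models.stable.state[models.cluster.ids == class.id.low, , drop = FALSE]

# Per-node average activity in each group, then good minus bad
avg.activity.diff <- colMeans(good) - colMeans(bad)
print(avg.activity.diff)
```

In this toy case node A is active in every 'good' model and in no 'bad' model, so its difference is 1, while node B has the same average activity in both groups and gets 0.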

Details

If a node has a value close to -1, it means that, on average, this node is more inhibited in the 'good' models compared to the 'bad' ones, while a value close to 1 means that the node is more activated in the 'good' models. A value close to 0 indicates that the activity of that node does not differ much between the 'good' and 'bad' models, so it will not be a node of interest when searching for indicators of better performance (higher MCC score/class) in the good models.
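
One possible way to act on this interpretation is to keep only the nodes whose absolute difference exceeds some cutoff and rank them by magnitude. The sketch below uses a made-up result vector and a hypothetical threshold of 0.4; neither comes from the package, and a suitable cutoff depends on the data at hand.

```r
# Hypothetical result vector, shaped like the value this function returns
avg.activity.diff <- c(A = 1, B = 0, C = -0.5, D = 0.1)

# Hypothetical cutoff: nodes below it are treated as uninformative
threshold <- 0.4

# Keep nodes with a large absolute difference, strongest first
candidates <- avg.activity.diff[abs(avg.activity.diff) >= threshold]
candidates <- candidates[order(abs(candidates), decreasing = TRUE)]
print(candidates)
```

The sign of each retained value still matters: positive values point to nodes that are more active in the 'good' models, negative values to nodes that are more inhibited in them.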

See Also

Other average data difference functions: get_avg_activity_diff_based_on_specific_synergy_prediction, get_avg_activity_diff_based_on_synergy_set_cmp, get_avg_activity_diff_based_on_tp_predictions, get_avg_activity_diff_mat_based_on_mcc_clustering, get_avg_activity_diff_mat_based_on_specific_synergy_prediction, get_avg_activity_diff_mat_based_on_tp_predictions, get_avg_link_operator_diff_mat_based_on_mcc_clustering, get_avg_link_operator_diff_mat_based_on_specific_synergy_prediction, get_avg_link_operator_diff_mat_based_on_tp_predictions