graph_two_GOspecies is a function to create undirected graphs
graph_two_GOspecies is an analog of the graphGOspecies function and has the same options ("Categories" and "GO"). However, the edge and node weights are calculated slightly differently. Since two species are compared, three graphs are available: G_1, G_2, and G_3. G_1 and G_2 represent each of the species analyzed, and G_3 is a subgraph of G_1 and G_2 that contains the GO terms or categories co-occurring between both species.
Categories option:
(Weight): The nodes (V) represent groups of gene lists (categories), and the edges (E) represent GO terms co-occurring between pairs of categories. The node weight K_w measures how well GO terms are conserved between the two species and across a series of categories, but it is biased toward categories. For a node u, it is the sum of the edge weights w(u,v) over the neighbors v of u in each species graph:
$$K_w(u) = \sum_{v \in V_1} w(u,v) + \sum_{v \in V_2} w(u,v) \qquad (5)$$
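As a minimal sketch of equation 5, the node weight can be computed from two toy edge lists. The edge lists, their `WEIGHT` values, and the helpers `node_strength` and `k_w` are hypothetical and only illustrate the sum; how the package derives the edge weight w itself is not shown here.

```r
# Toy undirected edge lists for two species (hypothetical values, not package output)
edges_sp1 <- data.frame(
  SOURCE = c("A", "A", "B"),
  TARGET = c("B", "C", "C"),
  WEIGHT = c(0.5, 0.2, 0.1)
)
edges_sp2 <- data.frame(
  SOURCE = c("A", "B"),
  TARGET = c("B", "C"),
  WEIGHT = c(0.4, 0.3)
)

# Sum of the weights of all edges incident to node u (undirected, so u may
# appear as SOURCE or as TARGET)
node_strength <- function(edges, u) {
  sum(edges$WEIGHT[edges$SOURCE == u | edges$TARGET == u])
}

# K_w(u): strength of u in species 1 plus strength of u in species 2
k_w <- function(u) node_strength(edges_sp1, u) + node_strength(edges_sp2, u)

K_w <- vapply(c("A", "B", "C"), k_w, numeric(1))
print(K_w)
```

In this toy case category B has the largest weight because it has edges to both other categories in both species.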
(shared weight): The nodes (V) represent groups of gene lists (categories), and the edges (E) represent GO terms co-occurring between pairs of categories that are shared between the two species. The node weight K_s is computed from a shared edge weight s, where N1 and N2 are the sets of GO terms associated with the edge e = (u,v) for species 1 and 2, respectively. The node shared weight K_s(u) is therefore the sum of s over the edges incident to u.
$$s(e) = \frac{|N1 \cap N2|}{|N1 \cup N2|} \qquad (6)$$
$$K_s(u) = \sum_{v \in V_1 \cap V_2} s(u,v) \qquad (7)$$
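The shared weight can be sketched on toy data as follows, assuming that s(e) is the ratio of the number of GO terms common to both species to the number of GO terms found in either species for that edge (a Jaccard index). The GO identifiers, the `"A|B"` edge naming, and the helpers `jaccard` and `k_s` are hypothetical and not part of the package.

```r
# Hypothetical GO term sets attached to each category pair, per species
go_sp1 <- list(
  "A|B" = c("GO:0001", "GO:0002", "GO:0003"),
  "A|C" = c("GO:0004"),
  "B|C" = c("GO:0005", "GO:0006")
)
go_sp2 <- list(
  "A|B" = c("GO:0002", "GO:0003", "GO:0007"),
  "A|C" = c("GO:0008"),
  "B|C" = c("GO:0005", "GO:0006")
)

# Assumed form of s(e): shared GO terms divided by all GO terms on the edge
jaccard <- function(n1, n2) {
  length(intersect(n1, n2)) / length(union(n1, n2))
}

# Only edges present in both species can carry a shared weight
shared <- intersect(names(go_sp1), names(go_sp2))
s <- vapply(shared, function(e) jaccard(go_sp1[[e]], go_sp2[[e]]), numeric(1))

# K_s(u): sum of s over the shared edges that touch node u
k_s <- function(u) {
  ends <- strsplit(names(s), "|", fixed = TRUE)
  incident <- vapply(ends, function(p) u %in% p, logical(1))
  sum(s[incident])
}

K_s <- vapply(c("A", "B", "C"), k_s, numeric(1))
print(s)
print(K_s)
```

An edge whose GO terms are identical in both species gets s = 1, and an edge with no GO term in common gets s = 0.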
(combined weight): This node weight K_c(u) combines the weight and the shared weight. Its purpose is to find the categories whose GO terms co-occur most frequently, so that functional similarities between two species can be observed with a balance between GO terms co-occurring among gene lists (categories) and between the two species. This node weight varies from -1 (categories with GO terms found in only one species and in few categories) to 1 (categories with GO terms shared widely between species and among other categories). The combined node weight K_c is defined as the sum of the min-max normalized weights K_w and K_s minus 1.
$$minmax(y) = \frac{y - min(y)}{max(y) - min(y)} \qquad (8)$$
$$K_c(u) = minmax(K_w(u)) + minmax(K_s(u)) - 1 \qquad (9)$$
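Equations 8 and 9 translate almost directly into R. The sketch below uses hypothetical K_w and K_s values for three categories; the `minmax` helper is written here for illustration and is not claimed to be the package's internal code.

```r
# Min-max normalization (equation 8); returns NaN if all values are equal,
# because the denominator is then zero
minmax <- function(y) (y - min(y)) / (max(y) - min(y))

# Hypothetical node weights for three categories
K_w <- c(A = 1.1, B = 1.3, C = 0.6)
K_s <- c(A = 0.5, B = 1.5, C = 1.0)

# Combined weight (equation 9): each normalized term lies in [0, 1],
# so the result lies in [-1, 1]
K_c <- minmax(K_w) + minmax(K_s) - 1
print(round(K_c, 3))
```

Category B reaches the maximum of 1 because it has the largest value of both K_w and K_s, while a category at the minimum of both would score -1.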
GO option: As with the Categories option, three graphs are available: G_1, G_2, and G_3. G_1 and G_2 represent each of the species analyzed, and G_3 is a subgraph of G_1 and G_2 that contains the GO terms or categories co-occurring between both species. In this case, nodes are GO terms and edges are the categories in which a pair of GO terms co-occurs. The node weight is similar to the GO weight calculated by the graphGOspecies function and is computed as in equation 5.
$$K_w(u) = \sum_{v \in V_1} w(u,v) + \sum_{v \in V_2} w(u,v) \qquad (5)$$
graph_two_GOspecies(
  x,
  species1,
  species2,
  GOterm_field,
  saveGraph = FALSE,
  option = "Categories",
  numCores = 2,
  outdir = NULL,
  filename = NULL
)
This function will return a list with two slots: edges and nodes.
(Categories):
Edges list columns:
| Column | Description |
| SOURCE and TARGET | The source and target categories (Nodes in the edge) |
| GO_N | The number of GO terms between the categories |
| WEIGHT | Edge weight |
| GO | GO terms available for both nodes |
| SP1 | Number of GO terms for the species 1 |
| SP2 | Number of GO terms for the species 2 |
| SHARED | Number of GO terms shared or co-occurring between the categories |
| SHARED_WEIGHT | Shared weight for the edge |
Node list columns:
| Column | Description |
| CAT | Category name |
| CAT_WEIGHT | Node weight |
| SHARED_WEIGHT | Shared weight for the node |
| COMBINED_WEIGHT | Combined weight for the node |
(GO):
Edges list columns:
| Column | Description |
| SOURCE and TARGET | The source and target GO terms (Nodes in the edge) |
| FEATURE | The number of Categories where both GO Terms were found |
| SP | Species where the GO terms were found (Species 1, Species 2 or Shared) |
| WEIGHT | Edge weight |
Node list columns:
| Column | Description |
| GO | GO term node name |
| GO_WEIGHT | Node weight |
x: A list obtained as output of the compareGOspecies function.
species1: A string with the species name for species 1 (e.g., "H. sapiens").
species2: A string with the species name for species 2 (e.g., "A. thaliana").
GOterm_field: A string with the column name of the GO terms (e.g., "Functional_Category").
saveGraph: Logical; if TRUE, the function saves the graph in GraphML format.
option: One of "Categories" or "GO" (default value "Categories"). With "Categories", nodes are categories and edges represent the GO terms co-occurring between them. With "GO", nodes are GO terms and edges represent the categories in which they co-occur, with the species (Species 1, Species 2 or Shared) stored as an edge attribute.
numCores: Numeric; number of cores to use for the process (default value numCores=2). For the example below, only one core will be used.
outdir: Folder where the graph file will be saved (e.g., "D:"). This parameter only works when saveGraph=TRUE.
filename: Name of the graph file to be saved in the outdir given by the user. This parameter only works when saveGraph=TRUE.
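# graph_two_GOspecies and the example data are provided by the GOCompare package
library(GOCompare)

# Loading the example comparison object
data(comparison_ex_compress_CH)

# Column that holds the GO terms
GOterm_field <- "Functional_Category"

# Defining the species names
species1 <- "H. sapiens"
species2 <- "A. thaliana"

# Building the category graph on one core without saving it to disk
x_graph <- graph_two_GOspecies(x = comparison_ex_compress_CH,
                               species1 = species1,
                               species2 = species2,
                               GOterm_field = GOterm_field,
                               numCores = 1,
                               saveGraph = FALSE,
                               option = "Categories",
                               outdir = NULL,
                               filename = NULL)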
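Once a result is available, its nodes and edges slots are ordinary tables and can be ranked or filtered with base R. The sketch below uses a small mock list that only mimics the documented column names for the "Categories" option; every value in it is hypothetical, and it stands in for real output so that the snippet runs without the package.

```r
# Mock object with the documented slots and a subset of the documented
# columns (hypothetical values, not real function output)
mock_graph <- list(
  edges = data.frame(
    SOURCE = c("A", "B"),
    TARGET = c("B", "C"),
    SHARED_WEIGHT = c(0.5, 0)
  ),
  nodes = data.frame(
    CAT = c("A", "B", "C"),
    CAT_WEIGHT = c(1.1, 1.3, 0.6),
    SHARED_WEIGHT = c(0.5, 1.5, 1.0),
    COMBINED_WEIGHT = c(-0.29, 1.00, -0.50)
  )
)

# Rank categories from highest to lowest combined weight
nodes <- mock_graph$nodes
ranked <- nodes[order(nodes$COMBINED_WEIGHT, decreasing = TRUE), ]
print(ranked$CAT)

# Keep only the edges that have a non-zero shared weight between species
shared_edges <- subset(mock_graph$edges, SHARED_WEIGHT > 0)
print(shared_edges)
```

With real output, replacing `mock_graph` by `x_graph` should work the same way as long as the columns match the tables above.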