graphGOspecies is a function to create undirected graphs using two options:
Categories option:
The nodes (V) represent groups of gene lists (categories), and the edges (E) represent GO terms co-occurring between pairs of categories. More specifically, two categories $u, v \in V$ are connected by an edge $e = (u, v)$. The edge weight $w(e)$ is defined as the ratio of the number of GO terms (e.g. biological processes) co-occurring between the two categories, $|BP_u \cap BP_v|$, to the total number of GO terms available, $|BP|$ (Equation 1). A node weight $K_w(u)$ is defined as the sum of the weights of the edges in which node $u$ participates (Equation 2). Thus, the node weight represents how frequently GO terms are reported and expressed in a biological phenomenon.
$$w(e) = \frac{|BP_u \cap BP_v|}{|BP|} \tag{1}$$
$$K_w(u) = \sum_{v \in V} w(u, v) \tag{2}$$
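The two weights defined above can be illustrated on a toy example. This is a minimal base-R sketch, not the package's implementation: the `go_by_category` list and its GO identifiers are hypothetical, and dropping category pairs that share no GO term is an assumption made here for illustration.

```r
# Hypothetical toy input: each category maps to the GO terms enriched in it.
go_by_category <- list(
  A = c("GO:1", "GO:2", "GO:3"),
  B = c("GO:2", "GO:3", "GO:4"),
  C = c("GO:4", "GO:5")
)

# |BP|: total number of distinct GO terms available.
n_terms <- length(unique(unlist(go_by_category)))

# All unordered pairs of categories (candidate edges).
pairs <- t(combn(names(go_by_category), 2))
edges <- data.frame(SOURCE = pairs[, 1], TARGET = pairs[, 2],
                    stringsAsFactors = FALSE)

# |BP_u n BP_v|: number of GO terms shared by each pair of categories.
edges$FEATURES_N <- mapply(
  function(u, v) length(intersect(go_by_category[[u]], go_by_category[[v]])),
  edges$SOURCE, edges$TARGET,
  USE.NAMES = FALSE
)

# Equation 1: edge weight as the shared fraction of all GO terms.
edges$WEIGHT <- edges$FEATURES_N / n_terms

# Assumption: pairs with no shared GO term are not connected.
edges <- edges[edges$FEATURES_N > 0, ]

# Equation 2: node weight as the sum of the weights of incident edges.
node_weight <- sapply(names(go_by_category), function(u) {
  sum(edges$WEIGHT[edges$SOURCE == u | edges$TARGET == u])
})

print(edges)
print(node_weight)
```

In this toy case category B shares terms with both A and C, so it receives the largest node weight.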
GO option:
The nodes V' represent GO terms and the edges E' represent categories where a pair of GO terms co-occur. More specifically, two GO terms $u', v' \in V'$ are connected by an edge $e' = (u', v')$. The edge weight $w'(e')$ corresponds to the number of categories in which the GO terms $u'$ and $v'$ co-occur, compared with the total number of GO terms (Equation 3). A node weight $K'_w(u')$ is defined as the sum of the weights of the edges in which $u'$ participates (Equation 4); in this case the weight represents the importance of a GO term (how frequently it co-occurs with other terms). Please be patient: this option requires a long time to finish.
$$w'(e') = \frac{|C_{u'} \cap C_{v'}|}{|BP|} \tag{3}$$
$$K'_w(u') = \sum_{v' \in V'} w'(u', v') \tag{4}$$
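The GO option can be sketched the same way by inverting the mapping, so that each GO term is associated with the set of categories in which it appears. This is again a base-R illustration under assumptions rather than the package's code: the toy `go_by_category` list is hypothetical, and pairs of GO terms with no shared category are dropped.

```r
# Hypothetical toy input: each category maps to the GO terms enriched in it.
go_by_category <- list(
  A = c("GO:1", "GO:2", "GO:3"),
  B = c("GO:2", "GO:3", "GO:4"),
  C = c("GO:4", "GO:5")
)

# Invert the mapping: for each GO term u', the set of categories C_u'.
long <- stack(go_by_category)  # columns: values (GO term), ind (category)
cats_by_go <- split(as.character(long$ind), long$values)

# |BP|: total number of distinct GO terms.
n_terms <- length(cats_by_go)

# All unordered pairs of GO terms; this grows as n * (n - 1) / 2.
pairs <- t(combn(names(cats_by_go), 2))
edges <- data.frame(SOURCE = pairs[, 1], TARGET = pairs[, 2],
                    stringsAsFactors = FALSE)

# |C_u' n C_v'|: number of categories in which both GO terms appear.
edges$FEATURE <- mapply(
  function(u, v) length(intersect(cats_by_go[[u]], cats_by_go[[v]])),
  edges$SOURCE, edges$TARGET,
  USE.NAMES = FALSE
)

# Equation 3: edge weight relative to the total number of GO terms.
edges$WEIGHT <- edges$FEATURE / n_terms

# Assumption: GO terms that never co-occur are not connected.
edges <- edges[edges$FEATURE > 0, ]

# Equation 4: node weight as the sum of the weights of incident edges.
go_weight <- sapply(names(cats_by_go), function(u) {
  sum(edges$WEIGHT[edges$SOURCE == u | edges$TARGET == u])
})

print(edges)
print(go_weight)
```

Because every pair of GO terms is a candidate edge, the number of comparisons grows quadratically with the number of terms, which is consistent with the long running time noted for this option.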
graphGOspecies(
  df,
  GOterm_field,
  option = "Categories",
  numCores = 2,
  saveGraph = FALSE,
  outdir = NULL,
  filename = NULL
)

This function returns a list with two slots: edges and nodes.
(Categories):
Edges list columns:
| Column | Description |
| --- | --- |
| SOURCE and TARGET | The source and target categories (nodes in the edge) |
| FEATURES_N | The number of GO terms shared by the two categories |
| WEIGHT | Edge weight |
| FEATURES | GO terms shared by both nodes |
Node list columns:
| Column | Description |
| --- | --- |
| feature | Category name |
| GO_count | Number of GO terms for the node |
| WEIGHT | Node weight |
(GO):
Edges list columns:
| Column | Description |
| --- | --- |
| SOURCE and TARGET | The source and target GO terms (nodes in the edge) |
| FEATURE | The number of categories in which both GO terms were found |
| WEIGHT | Edge weight |
Node list columns:
| Column | Description |
| --- | --- |
| GO | GO term node name |
| GO_WEIGHT | Node weight |
df: A data frame with the results of a functional enrichment analysis for a species, with an extra column "feature" holding the features to be compared.
GOterm_field: A string with the name of the column containing the GO terms (e.g. "Functional.Category").
option: Either "GO" or "Categories" (default: "Categories"). With "GO", the graph has GO terms as nodes and features as edges; with "Categories", it has features as nodes and GO terms as edges.
numCores: Numeric. Number of cores to use for the process (default: numCores = 2). In the example below, only one core is used.
saveGraph: Logical. If TRUE, the function saves the graph in GraphML format (default: FALSE).
outdir: The folder where the graph file is saved (e.g. "D:"). This parameter only takes effect when saveGraph = TRUE.
filename: The name of the graph file to be saved in outdir. This parameter only takes effect when saveGraph = TRUE.
# Loading example datasets
data(H_sapiens_compress)
GOterm_field <- "Functional_Category"

# Running function
x <- graphGOspecies(
  df = H_sapiens_compress,
  GOterm_field = GOterm_field,
  option = "Categories",
  numCores = 1,
  saveGraph = FALSE,
  outdir = NULL,
  filename = NULL
)
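
Once the function has returned, the edges and nodes slots can be handled as ordinary data frames. The sketch below uses a hypothetical stand-in object, `x_demo`, built by hand with the column names documented for the "Categories" option; its values are made up for illustration and are not output from the package. The 0.3 threshold is likewise arbitrary.

```r
# Hypothetical stand-in for the list returned with option = "Categories".
# Column names follow the documented output; the values are toy data.
x_demo <- list(
  edges = data.frame(
    SOURCE = c("A", "B"),
    TARGET = c("B", "C"),
    FEATURES_N = c(2, 1),
    WEIGHT = c(0.4, 0.2),
    stringsAsFactors = FALSE
  ),
  nodes = data.frame(
    feature = c("A", "B", "C"),
    GO_count = c(3, 3, 2),
    WEIGHT = c(0.4, 0.6, 0.2),
    stringsAsFactors = FALSE
  )
)

# Rank categories by node weight, highest first.
top_nodes <- x_demo$nodes[order(x_demo$nodes$WEIGHT, decreasing = TRUE), ]

# Keep only edges whose weight reaches an arbitrary threshold.
strong_edges <- x_demo$edges[x_demo$edges$WEIGHT >= 0.3, ]

print(top_nodes)
print(strong_edges)
```

The same indexing should apply to the real result by replacing `x_demo` with `x`, assuming the slots are named edges and nodes as described above.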