GOCompare (version 1.0.2.2)

graph_two_GOspecies: Undirected network representation for the results of functional enrichment analysis to compare two species and a series of categories

Description

graph_two_GOspecies is a function to create undirected graphs.

graph_two_GOspecies is analogous to the graphGOspecies function and has the same options ("Categories" and "GO"). However, the edge and node weights are calculated slightly differently. Because two species are compared, three graphs are available: G_1, G_2, and G_3. G_1 and G_2 represent each of the species analyzed, and G_3 is a subgraph of G_1 and G_2 that contains the GO terms or categories co-occurring in both species.

Categories option: (Weight): The nodes (V) represent groups of gene lists (categories), and the edges (E) represent GO terms co-occurring between pairs of categories. The node weight K_w measures how GO terms are conserved between the two species and across a series of categories, but it is biased toward categories.

K_w(u) = \sum_{v \in V_1} w(u,v) + \sum_{v \in V_2} w(u,v)    (5)
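As a rough illustration of equation 5, the sketch below sums the weights of the edges touching each category in each species graph and adds the two totals. The edge tables, the WEIGHT values, and the helper name node_strength are hypothetical; how the package derives the edge weight w(u,v) is not described in this section.

```r
# Hypothetical edge lists for species 1 and species 2 (illustrative values only)
edges_sp1 <- data.frame(
  SOURCE = c("A", "A"), TARGET = c("B", "C"),
  WEIGHT = c(0.4, 0.1), stringsAsFactors = FALSE
)
edges_sp2 <- data.frame(
  SOURCE = c("A", "B"), TARGET = c("B", "C"),
  WEIGHT = c(0.3, 0.2), stringsAsFactors = FALSE
)

# Sum the weights of all edges incident to each node (hypothetical helper)
node_strength <- function(edges, nodes) {
  vapply(nodes, function(u) {
    sum(edges$WEIGHT[edges$SOURCE == u | edges$TARGET == u])
  }, numeric(1))
}

nodes <- c("A", "B", "C")

# Equation 5: node weight = strength in G_1 + strength in G_2
k_w <- node_strength(edges_sp1, nodes) + node_strength(edges_sp2, nodes)
print(k_w)
```

A category that is well connected in both species graphs accumulates weight from both sums, which is why this weight favors categories with many edges.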

(shared weight): The nodes (V) represent groups of gene lists (categories), and the edges (E) represent GO terms co-occurring between pairs of categories that are shared between the two species. The node weight K_s is computed from the shared weight of the edges, s, where N1 and N2 are the sets of GO terms associated with the edge e = (u,v) for species 1 and 2, respectively. The node shared weight K_s(u) is therefore the sum of s over the edges incident to u.

s(e) = \frac{|N1 \cap N2|}{|N1 \cup N2|}    (6)

K_s(u) = \sum_{v \in (V_1 \cup V_2)} s(u,v)    (7)
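The shared weight can be sketched in base R as follows, assuming that s(e) is the number of GO terms found on the edge in both species divided by the number found in either species (a Jaccard index), and that K_s(u) sums s over every edge touching u. The GO identifiers, the edge table, and the helper names are hypothetical; this is not the package's internal code.

```r
# Hypothetical edges between categories (illustrative data only)
edges <- data.frame(
  SOURCE = c("A", "A", "B"),
  TARGET = c("B", "C", "C"),
  stringsAsFactors = FALSE
)
# GO terms attached to each edge, per species (N1 and N2 in equation 6)
go_sp1 <- list(c("GO:1", "GO:2", "GO:3"), c("GO:4"), character(0))
go_sp2 <- list(c("GO:2", "GO:3", "GO:5"), c("GO:4"), c("GO:6"))

# Equation 6: shared GO terms over all GO terms on the edge
shared_weight <- function(n1, n2) {
  union_size <- length(union(n1, n2))
  if (union_size == 0) return(0)  # assumption: an empty edge gets weight 0
  length(intersect(n1, n2)) / union_size
}
edges$SHARED_WEIGHT <- mapply(shared_weight, go_sp1, go_sp2)

# Equation 7: sum the edge shared weights over the edges touching each node
node_shared_weight <- function(edges) {
  nodes <- sort(unique(c(edges$SOURCE, edges$TARGET)))
  vapply(nodes, function(u) {
    sum(edges$SHARED_WEIGHT[edges$SOURCE == u | edges$TARGET == u])
  }, numeric(1))
}

k_s <- node_shared_weight(edges)
print(edges)
print(k_s)
```

In this toy data the edge B-C has no GO term in common between the species, so it contributes nothing to the shared weight of B or C.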

(combined weight): This node weight K_c(u) is a combination of the weight and the shared weight. The idea of the combined weight is to find categories with more frequently co-occurring GO terms, in order to observe functional similarities between two species while balancing GO terms co-occurring among gene lists (categories) and between the two species. This node weight varies from -1 (categories with GO terms found in only one species and in few categories) to 1 (categories with GO terms shared widely between species and among other categories). The combined node weight K_c is defined as the sum of the min-max normalized weights K_w and K_s minus 1.

minmax(y) = \frac{y - min(y)}{max(y) - min(y)}    (8)

K_c(u) = minmax(K_w(u)) + minmax(K_s(u)) - 1    (9)
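A minimal sketch of equations 8 and 9 in base R is shown below. The node weights k_w and k_s are hypothetical values chosen for illustration, and the handling of a constant vector (where max equals min) is an assumption made here to avoid division by zero.

```r
# Equation 8: min-max normalization to the range [0, 1]
minmax <- function(y) {
  rng <- range(y)
  # Assumption: a constant vector is mapped to 0 to avoid dividing by zero
  if (rng[1] == rng[2]) return(rep(0, length(y)))
  (y - rng[1]) / (rng[2] - rng[1])
}

# Hypothetical node weights for three categories
k_w <- c(A = 10, B = 4, C = 7)       # weight (equation 5)
k_s <- c(A = 1.5, B = 0.5, C = 1.0)  # shared weight (equation 7)

# Equation 9: combined weight in the range [-1, 1]
k_c <- minmax(k_w) + minmax(k_s) - 1
print(k_c)
```

Category A has the largest value of both weights and reaches 1, category B has the smallest of both and reaches -1, and category C sits in the middle at 0.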

GO option: As with the Categories option, three graphs are available: G_1, G_2, and G_3. G_1 and G_2 represent each of the species analyzed, and G_3 is a subgraph of G_1 and G_2 that contains the GO terms or categories co-occurring in both species. In this case, nodes are GO terms and edges are categories in which the GO terms co-occur. This weight is similar to the GO weight calculated by the graphGOspecies function, and it is calculated as in equation 5.

K_w(u) = \sum_{v \in V_1} w(u,v) + \sum_{v \in V_2} w(u,v)    (5)

Usage

graph_two_GOspecies(
  x,
  species1,
  species2,
  GOterm_field,
  saveGraph = FALSE,
  option = "Categories",
  numCores = 2,
  outdir = NULL,
  filename = NULL
)

Value

This function returns a list with two slots: edges and nodes.

(Categories):

Edges list columns:

Column               Description
SOURCE and TARGET    The source and target categories (nodes in the edge)
GO_N                 The number of GO terms between the categories
WEIGHT               Edge weight
GO                   GO terms available for both nodes
SP1                  Number of GO terms for species 1
SP2                  Number of GO terms for species 2
SHARED               Number of GO terms shared or co-occurring between the categories
SHARED_WEIGHT        Shared weight for the edge

Node list columns:

Column               Description
CAT                  Category name
CAT_WEIGHT           Node weight
SHARED_WEIGHT        Shared weight for the node
COMBINED_WEIGHT      Combined weight for the node

(GO):

Edges list columns:

Column               Description
SOURCE and TARGET    The source and target GO terms (nodes in the edge)
FEATURE              The number of categories where both GO terms were found
SP                   Species where the GO terms were found (Species 1, Species 2 or Shared)
WEIGHT               Edge weight

Node list columns:

Column               Description
GO                   GO term node name
GO_WEIGHT            Node weight
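To illustrate how the returned list might be used, the sketch below builds a small mock object with the documented slots and a subset of the documented columns for the "Categories" option, then ranks categories by combined weight and keeps only edges with shared GO terms. The object and all of its values are hypothetical stand-ins; real output comes from calling graph_two_GOspecies.

```r
# Mock of the documented return structure (hypothetical values, subset of columns)
x_graph <- list(
  edges = data.frame(
    SOURCE = c("A", "A", "B"),
    TARGET = c("B", "C", "C"),
    SHARED = c(2L, 1L, 0L),
    SHARED_WEIGHT = c(0.5, 1, 0),
    stringsAsFactors = FALSE
  ),
  nodes = data.frame(
    CAT = c("A", "B", "C"),
    CAT_WEIGHT = c(0.8, 0.9, 0.3),
    SHARED_WEIGHT = c(1.5, 0.5, 1),
    COMBINED_WEIGHT = c(0.83, 0, -0.5),
    stringsAsFactors = FALSE
  )
)

# Rank categories from most to least conserved by combined weight
ranked_nodes <- x_graph$nodes[order(-x_graph$nodes$COMBINED_WEIGHT), ]

# Keep only the edges that carry at least one GO term shared by both species
shared_edges <- x_graph$edges[x_graph$edges$SHARED > 0, ]

print(ranked_nodes)
print(shared_edges)
```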

Arguments

x

A list obtained as output of the compareGOspecies function

species1

A string with the species name for species 1 (e.g., "H. sapiens")

species2

A string with the species name for species 2 (e.g., "A. thaliana")

GOterm_field

A string with the name of the column that holds the GO terms (e.g., "Functional_Category")

saveGraph

logical; if TRUE, the function saves the graph in GraphML format

option

(values: "Categories" or "GO"). This option selects the kind of graph to create: with "Categories", nodes are categories and edges are GO terms co-occurring between pairs of categories; with "GO", nodes are GO terms, edges are the categories in which the GO terms co-occur, and species membership is an edge attribute (default value = "Categories")

numCores

numeric; number of cores to use for the process (default value numCores = 2). In the example below, only one core is used

outdir

The folder where the graph file will be saved (e.g., "D:"). This parameter only works when saveGraph = TRUE

filename

The name of the graph file to be saved in the outdir specified by the user. This parameter only works when saveGraph = TRUE

Examples

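library(GOCompare)

# Name of the column that holds the GO terms
GOterm_field <- "Functional_Category"
# Load the example comparison data
data(comparison_ex_compress_CH)
# Define the species names
species1 <- "H. sapiens"
species2 <- "A. thaliana"
# Build the category graph on a single core without saving it to disk
x_graph <- graph_two_GOspecies(
  x = comparison_ex_compress_CH,
  species1 = species1,
  species2 = species2,
  GOterm_field = GOterm_field,
  numCores = 1,
  saveGraph = FALSE,
  option = "Categories",
  outdir = NULL,
  filename = NULL
)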