GOFunction (version 1.20.0)

GOFunction: main function of the GO-function package

Description

The GOFunction function is the main function of the GO-function package and can generate a set of biologically relevant GO terms.

Usage

GOFunction(interestGenes, refGenes, organism = "org.Hs.eg.db", ontology = "BP", fdrmethod = "BY", fdrth = 0.05, ppth = 0.05, pcth = 0.05, poth = 0.05, peth = 0.05, bmpSize = 2000, filename = "sigTerm")

Arguments

interestGenes
interestGenes is the set of interesting genes (e.g. differentially expressed genes), given as Entrez gene IDs.
refGenes
refGenes is the set of background genes against which the interesting genes are compared, given as Entrez gene IDs.
organism
The GO-function package can currently be applied to data from 18 organisms. The user must install the gene annotation package that corresponds to the organism being analysed. The 18 organisms and their annotation packages are: Anopheles "org.Ag.eg.db", Bovine "org.Bt.eg.db", Canine "org.Cf.eg.db", Chicken "org.Gg.eg.db", Chimp "org.Pt.eg.db", E. coli strain K12 "org.EcK12.eg.db", E. coli strain Sakai "org.EcSakai.eg.db", Fly "org.Dm.eg.db", Human "org.Hs.eg.db", Mouse "org.Mm.eg.db", Pig "org.Ss.eg.db", Rat "org.Rn.eg.db", Rhesus "org.Mmu.eg.db", Streptomyces coelicolor "org.Sco.eg.db", Worm "org.Ce.eg.db", Xenopus "org.Xl.eg.db", Yeast "org.Sc.sgd.db", Zebrafish "org.Dr.eg.db". The default organism is "org.Hs.eg.db" (Human).
ontology
The default ontology is "BP" (Biological Process). The "CC" (Cellular Component) and "MF" (Molecular Function) ontologies can also be used.
fdrmethod
GO-function provides three p-value correction methods: "bonferroni", "BH" and "BY". The default fdrmethod is "BY".
fdrth
fdrth is the FDR cutoff used to identify statistically significant GO terms. The default is 0.05.
ppth
ppth is the significance level for testing whether the genes remaining in an ancestor term, after removing the genes in its significant offspring terms, are enriched with interesting genes. The default is 0.05.
pcth
pcth is the significance level for testing whether the frequency of interesting genes in the offspring terms is significantly different from that in the ancestor term. The default is 0.05.
poth
poth is the significance level for testing whether the overlapping genes of a term are significantly different from the non-overlapping genes of that term. The default is 0.05.
peth
peth is the significance level for testing whether the non-overlapping genes of a term are enriched with interesting genes. The default is 0.05.
bmpSize
bmpSize is the width and height, in pixels, of the GO DAG plot of all statistically significant terms. The default is 2000 pixels, so that the GO DAG structure is shown clearly. If the GO DAG is very complex, the user should increase bmpSize. Note: if an error occurs at the step "bmp(filename, width = 2000, ..." when running GO-function, the user should decrease bmpSize.
filename
filename is the base name of the output files that store the table and the GO DAG plot of all statistically significant terms (e.g. "sigTerm" gives "sigTerm.csv" and "sigTerm.bmp").
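
The three fdrmethod names are also method names accepted by base R's p.adjust(). As a rough illustration of how the choice of correction interacts with the fdrth cutoff, the sketch below adjusts a small vector of p values with each method and counts how many pass 0.05. The p values are invented for illustration, and the assumption that GO-function's corrections behave like p.adjust() is ours, not something this page states.

```r
# Hypothetical raw p values for six GO terms (invented for illustration).
raw_p <- c(0.0001, 0.004, 0.012, 0.03, 0.2, 0.7)
fdrth <- 0.05

# Adjust the same p values with each of the three correction methods.
methods <- c("bonferroni", "BH", "BY")
adjusted <- sapply(methods, function(m) p.adjust(raw_p, method = m))

# Count how many terms pass the cutoff under each method.
n_significant <- colSums(adjusted <= fdrth)
print(round(adjusted, 4))
print(n_significant)
```

"BY" is never less conservative than "BH", so a default of "BY" tends to keep fewer terms than "BH" at the same fdrth.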

Value

GO-function produces two types of output. First, it saves a table containing all statistically significant terms to a CSV file (e.g. "sigTerm.csv") in the current working folder. This table has seven columns: goid, name, refnum (the number of reference genes in a GO term), interestnum (the number of interesting genes in a GO term), pvalue, adjustp (the p value corrected by the FDR control) and FinalResults. The FinalResults column takes one of three values: (1) "Local" marks terms removed after treating local redundancy; (2) "Global" marks terms removed after treating global redundancy; (3) "Final" marks the remaining terms, for which there is evidence that their significance is not simply due to overlapping genes.

Second, GO-function saves the structure of the GO DAG for all statistically significant terms as a plot (e.g. "sigTerm.bmp") in the current working folder. In this plot, "circle", "box" and "rectangle" represent the "Local", "Global" and "Final" terms in the table, respectively. The colour shades represent the adjusted p values of the terms.
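
Because the table is an ordinary CSV file, it can be post-processed with base R. The sketch below keeps only the "Final" terms and sorts them by adjusted p value. To stay self-contained it first writes a small stand-in file to a temporary folder; every row in it is invented (including the placeholder GO IDs), and it assumes the real file has a header row with exactly the column names listed above.

```r
# Build a small stand-in for "sigTerm.csv" so the sketch runs on its own.
# All values are invented; only the column names come from the description.
example_table <- data.frame(
  goid = c("GO:9999991", "GO:9999992", "GO:9999993", "GO:9999994"),
  name = c("term A", "term B", "term C", "term D"),
  refnum = c(120, 45, 300, 80),
  interestnum = c(15, 9, 22, 12),
  pvalue = c(3e-3, 2e-4, 1e-3, 1e-5),
  adjustp = c(4e-2, 8e-3, 2e-2, 1e-3),
  FinalResults = c("Final", "Local", "Global", "Final"),
  stringsAsFactors = FALSE
)
csv_path <- file.path(tempdir(), "sigTerm.csv")
write.csv(example_table, csv_path, row.names = FALSE)

# Read the table back and keep only the terms labelled "Final",
# ordered from the smallest to the largest adjusted p value.
sig_terms <- read.csv(csv_path, stringsAsFactors = FALSE)
final_terms <- sig_terms[sig_terms$FinalResults == "Final", ]
final_terms <- final_terms[order(final_terms$adjustp), ]
print(final_terms[, c("goid", "name", "adjustp")])
```

For a real run, replace csv_path with the path to the file written by GOFunction, for example "sigTerm.csv" in the working folder.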

Examples

       library(GOFunction)
       # exampledata provides the interestGenes and refGenes vectors used below
       data(exampledata)
       sigTerm <- GOFunction(interestGenes, refGenes, organism = "org.Hs.eg.db", ontology = "BP", fdrmethod = "BY",
                             fdrth = 0.05, ppth = 0.05, pcth = 0.05, poth = 0.05, peth = 0.05, bmpSize = 2000, filename = "sigTerm")
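
Since the organism argument relies on a separately installed annotation package, it can help to check for that package before calling GOFunction. The sketch below is one possible pre-flight check; annotation_installed is a hypothetical helper written for this illustration and is not part of the GO-function package.

```r
# Hypothetical helper (not part of GOFunction): returns TRUE if the
# annotation package named by `organism` is installed and loadable.
annotation_installed <- function(organism) {
  requireNamespace(organism, quietly = TRUE)
}

organism <- "org.Hs.eg.db"
if (!annotation_installed(organism)) {
  # Bioconductor annotation packages are usually installed via BiocManager.
  message("Annotation package missing; try BiocManager::install(\"",
          organism, "\")")
}
```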
