gage (version 2.22.0)

go.gsets: Generate up-to-date GO (Gene Ontology) gene sets

Description

Generate up-to-date GO gene sets for specified species, which has either Bioconductor or user supplied gene annotation package.

Usage

go.gsets(species = "human", pkg.name=NULL, id.type = "eg")

Arguments

species
character, common name of the target species. Note that scientific name is not used here. Default species="human". For a list of other species names, type in: data(bods); bods
pkg.name
character, the gene annotation package name for the target species. It is either one of the Bioconductor OrgDb packages or a user supplied package with similar data structures created using AnnotationForge package. Default species=NULL, the right annotation package will be located for the specified speices in Bioconductor automatically. Otherwise, the user need to prepare and supply a usable annotation package in the same format.
id.type
character, target ID type for the get sets, case insensitive. Default gene.idtype="eg", i.e. Entrez Gene. Entrez Gene is the primary GO gene ID for many common model organisms, like human, mouse, rat etc. For other species may use "orf" (open reading frame) and other specific gene IDs, please type in: data(bods); bods to check the details.

Value

A named list with the following elements:
go.sets
GO gene sets, a named list. Each element is a character vector of member gene IDs for a single GO. The number of elements of this list is the total number of GO terms defined for the specified species.
go.subs
go.subs is a named list of three elements: "BP", "CC" and "MF", corresponding to biological process, cellular component and molecular function subtrees. It may be more desirable to conduct separated GO enrichment test on each of these 3 subtrees as shown in the example code.
go.mains
go.mains is a named vector of three elements: "BP", "CC" and "MF", corresponding to the root node of biological process, cellular component and molecular function subtree, i.e. their corresponding indecies in go.sets.

Details

The updated GO gene sets are derived using Bioconductor or user supplied gene annotation packages. This way, we can create gene set data for GO analysis for 19 common species annotated in Bioconductor and more others by the users. Note that we have generated GO gene set for 4 species, human, mouse, rat and yeast, and provided the data in package gageData.

References

Luo, W., Friedman, M., Shedden K., Hankenson, K. and Woolf, P GAGE: Generally Applicable Gene Set Enrichment for Pathways Analysis. BMC Bioinformatics 2009, 10:161

See Also

go.gs for precompiled GO and other common gene set data collection

Examples

#GAGE analysis use the updated GO  definitions, instead of
#go.gs.
#the following lines are not run due to long excution time.
#The data generation steop may take 10-30 seconds. You may want
#save the gene set data, i.e. go.hs, for future uses.
#go.hs=go.gsets(species="human")
#data(gse16873)
#hn=(1:6)*2-1
#dcis=(1:6)*2
#go.bp=go.hs$go.sets[go.hs$go.subs$BP]
#gse16873.bp.p <- gage(gse16873, gsets = go.bp,
#                        ref = hn, samp = dcis)

#Yeast and few othr species gene Id is different from Entre Gene
data(bods)
bods
#not run
#go.sc=go.gsets("Yeast")
#lapply(go.sc$go.sets[1:3], head, 3)