splineTimeR (version 1.0.1)

pathEnrich: Pathway enrichment analysis

Description

Performs a pathway enrichment analysis of a defined set of genes.

Usage

pathEnrich(geneList, geneSets, universe=NULL)

Arguments

geneList
vector of gene names to be used for pathway enrichment
geneSets
"GeneSetCollection" object with functional pathway gene sets
universe
number of genes that were probed in the initial experiment

Value

A data.frame with the following columns:

  • pathway: names of enriched pathways
  • description: gene set description (e.g. a link to the named gene set in MSigDB)
  • genes_in_pathway: total number of known genes in the pathway
  • %_match: number of matched genes relative to the total number of known genes in the pathway, given in %
  • pValue: p-value
  • adj.pValue: Benjamini-Hochberg adjusted p-value
  • overlap: genes from the input gene list that overlap with the known genes in the pathway

Additionally, a .txt file containing all of the above information is created.

Details

geneSets is a "GeneSetCollection" object containing gene sets from various databases. Gene sets from different sources are allowed, but they have to be provided in the Gene Matrix Transposed file format (*.gmt), where each gene set is described by a pathway name, a description, and the genes in the gene set. The Examples section shows two ways to define the geneSets object.
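
To make the .gmt layout concrete, here is a minimal base R sketch that writes a tiny tab-separated file and reads it back. The pathway names, URLs, and gene symbols are hypothetical and only illustrate the format: field 1 is the pathway name, field 2 the description, and the remaining fields the genes.

```r
# Hypothetical two-pathway GMT content (made-up names, links and genes).
gmt_lines <- c(
  "PATHWAY_A\thttp://example.org/PATHWAY_A\tGENE1\tGENE2\tGENE3",
  "PATHWAY_B\thttp://example.org/PATHWAY_B\tGENE2\tGENE4"
)
gmt_file <- tempfile(fileext = ".gmt")
writeLines(gmt_lines, gmt_file)

# Split each line on tabs: field 1 = name, field 2 = description, rest = genes.
fields <- strsplit(readLines(gmt_file), "\t", fixed = TRUE)
gene_sets <- lapply(fields, function(f) f[-(1:2)])
names(gene_sets) <- vapply(fields, function(f) f[1], character(1))
descriptions <- vapply(fields, function(f) f[2], character(1))
```

This manual parse is only meant to show the file structure. For pathEnrich, the file should be read with getGmt (from the GSEABase package), as in the Examples section, because geneSets has to be a "GeneSetCollection" object rather than a plain list.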

The variable universe is the total number of genes that were probed in the initial experiment, e.g. the number of all genes on a microarray. If universe is not defined, it is set to the number of all genes that can be mapped to any pathway in the chosen database.
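
The role of universe can be illustrated with a generic over-representation test. The sketch below uses a one-sided hypergeometric test followed by Benjamini-Hochberg adjustment in base R; that pathEnrich computes its p-values exactly this way is an assumption, not something stated on this page. All gene names, gene sets, and the universe size are hypothetical, and the sketch assumes every gene in the list and in the sets belongs to the universe.

```r
# Hypothetical toy inputs (not taken from splineTimeR).
universe_size <- 100   # assumed number of genes probed in the experiment
gene_list <- c("GENE1", "GENE2", "GENE3", "GENE4", "GENE5")
gene_sets <- list(
  PATHWAY_A = c("GENE1", "GENE2", "GENE3", "GENE6"),
  PATHWAY_B = c("GENE5", paste0("GENE", 10:28))
)

# One-sided hypergeometric test: probability of seeing at least the observed
# overlap when length(genes) genes are drawn at random from a universe of N.
enrich_one <- function(set, genes, N) {
  k <- length(intersect(genes, set))   # observed overlap
  m <- length(set)                     # genes in the pathway
  phyper(k - 1, m, N - m, length(genes), lower.tail = FALSE)
}

p_values <- vapply(gene_sets, enrich_one, numeric(1),
                   genes = gene_list, N = universe_size)
adj_p_values <- p.adjust(p_values, method = "BH")

# Fallback described above when universe is not supplied:
# the number of distinct genes that map to any pathway.
default_universe <- length(unique(unlist(gene_sets)))
```

A larger universe makes a given overlap less likely by chance, so the choice of universe directly affects the resulting p-values.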

References

Subramanian, A., Tamayo, P., Mootha, V. K., Mukherjee, S., Ebert, B. L., Gillette, M. A., Paulovich, A., Pomeroy, S. L., Golub, T. R., Lander, E. S. and Mesirov, J. P. (2005). Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. PNAS 102(43), 15545-15550.

http://www.broadinstitute.org/gsea/msigdb/collections.jsp

http://www.reactome.org/pages/download-data/

Examples

## Load the package; GSEABase provides getGmt()
library(splineTimeR)
library(GSEABase)

## Example 1 - using gene sets from the Molecular Signatures Database (MSigDB collections)
   ## Download .gmt file 'c2.all.v5.0.symbols.gmt' (all curated gene sets, gene symbols)
   ## from the Broad Institute, http://www.broad.mit.edu/gsea/downloads.jsp#msigdb, then
   geneSets <- getGmt("/path/to/c2.all.v5.0.symbols.gmt")
   ## load "eSetObject" containing simulated time-course data
   data(TCsimData)
   ## check for differentially expressed genes
   diffExprs <- splineDiffExprs(eSetObject = TCsimData, df = 3, cutoff.adj.pVal = 0.01, reference = "T1")
   ## use differentially expressed genes for pathway enrichment analysis
   enrichPath <- pathEnrich(geneList = rownames(diffExprs), geneSets = geneSets, universe = 6536)

## Example 2 - using gene sets from the Reactome Pathway Database
   ## Download and unzip .gmt.zip file 'ReactomePathways.gmt.zip'
   ## ("Reactome Pathways Gene Set" under "Specialized data formats") from the Reactome website
   ## http://www.reactome.org/pages/download-data/, then
   geneSets <- getGmt("/path/to/ReactomePathways.gmt")
   data(TCsimData)
   diffExprs <- splineDiffExprs(eSetObject = TCsimData, df = 3, cutoff.adj.pVal = 0.01, reference = "T1")
   enrichPath <- pathEnrich(geneList = rownames(diffExprs), geneSets = geneSets, universe = 6536)
   
## Small example with gene sets consisting of KEGG pathways only
geneSets <- getGmt(system.file("extdata", "c2.cp.kegg.v5.0.symbols.gmt", package="splineTimeR"))
data(TCsimData)
diffExprs <- splineDiffExprs(eSetObject = TCsimData, df = 3, cutoff.adj.pVal = 0.01, reference = "T1")
enrichPath <- pathEnrich(geneList = rownames(diffExprs), geneSets = geneSets, universe = 6536)
