hyperGoutput: Output Tables Based on Hypergeometric Test

Description

This function will output various tables containing probesets that are annotated to a particular GO, KEGG, or PFAM term. The tables are based on the results from a call to hyperGTest.

Usage

hyperGoutput(hyptObj, eset, pvalue, categorySize, sigProbesets, fit = NULL, subset = NULL, comp = 1, output = c("significant", "all", "split"), statistics = c("tstat", "pval", "FC"), html = TRUE, text = TRUE, ...)

Arguments

hyptObj

A HyperGResult object, usually produced by a call to hyperGTest

eset

An ExpressionSet object

pvalue

The p-value cutoff used for selecting significant GO terms. If not specified, it will be extracted from the HyperGResult object

categorySize

Minimum number of probesets in the universe that must be annotated to a term for that term to be reported as significant. See details for more information

sigProbesets

Vector of probeset IDs that were significant in the original analysis.

fit

An MArrayLM object, produced from a call to eBayes

subset

Numeric vector used to select particular tables to output. The default is to output tables for all terms. See details for more information

comp

Numeric vector of length one, used to indicate which comparison in the MArrayLM object to use for extracting relevant statistics. See details for more information

output

One of 'significant', 'all', or 'split'. See details for more information

statistics

Which statistics to output in the resulting tables. Choices include 'tstat', 'pval', or 'FC', corresponding to t-statistics, p-values, and fold change, respectively

html

Boolean. Output HTML tables? Defaults to TRUE

text

Boolean. Output text tables? Defaults to TRUE

...

Allows end user to pass further arguments. The most notable would be an anncols argument, passed to probes2table to control the hyperlinked annotation columns. See aaf.handler for more information

Value

This function returns no value, and is called solely for the side effect of outputting HTML and/or text tables.

Details

This function is designed to be used to output the results from a hypergeometric test for over-represented terms. This function would be used at the end of an analysis such as:

1.) Compute expression values
2.) Fit a model using limma
3.) Output significant probesets using limma2annaffy
4.) Perform hypergeometric test using hyperGTest

At step 4, one can output a list of the over-represented terms using htmlReport. One might then be interested in knowing which probesets contributed to the significance of a particular term, which is what this function is designed to do.

One argument that can be passed to htmlReport (and also to hyperGoutput) is categorySize, which gives a lower bound for the number of probesets with a particular term in the universe. For example, assume that a particular GO term is annotated to three probesets on a given chip. If, after doing a t-test to detect differentially expressed probesets, one of those probesets was found to be significantly differentially expressed and was then used in a hypergeometric test, that GO term would be significant, with a small p-value. However, this is probably not very strong evidence that the GO term is actually over-represented, since only three probesets were annotated to it to begin with. By setting categorySize to a sensible value (such as 10), this situation can be avoided.
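The situation described above can be reproduced with a small calculation. The following is a minimal sketch using phyper from base R; the universe size, the number of selected IDs, and all variable names are made-up values for illustration, and treating the categorySize bound as inclusive is an assumption rather than documented behavior.

```r
# Hypothetical numbers, chosen only for illustration.
universe_size <- 10000  # IDs in the universe
selected_size <- 100    # IDs called differentially expressed
term_size     <- 3      # IDs in the universe annotated to the term
term_hits     <- 1      # annotated IDs among the selected ones

# Upper-tail hypergeometric probability P(X >= term_hits).
p_small <- phyper(term_hits - 1, term_size, universe_size - term_size,
                  selected_size, lower.tail = FALSE)

# A categorySize-style filter: ignore terms annotated to too few IDs.
category_size <- 10
keep_term <- term_size >= category_size
```

With these numbers p_small falls below 0.05 even though a single probeset supports the term, while keep_term is FALSE, so a cutoff of 10 would exclude the term.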

This function will output HTML and/or text tables containing annotation information about each probeset as well as the expression values. In addition, if limma was used to fit the model, the relevant statistics (t-statistic, p-value, fold change) can also be output in the table by passing the MArrayLM object that resulted from a call to eBayes. The statistics argument can be used to control which statistics are output.

By default hyperGoutput will output tables for all significant terms, which may end up being quite a few tables. Usually only a few terms are of interest, so there is a subset argument that can be used to select only those terms. This argument follows directly from the order of the table output by htmlReport or summary. For instance, if the first, third and fifth terms in the HTML table output by htmlReport were of interest, one would use subset=c(1,3,5).

One critical step prior to the hypergeometric test is to subset the probesets to unique Entrez Gene IDs. Note, however, that the functions used by hyperGoutput will output all the probesets annotated to a particular term. The output argument is used to control this behavior. If output = "significant" (the default), then only those probesets that correspond to the original subsetting will be output. If output = "all", then all probesets will be output (grouped by Entrez ID), with the 'significant' probeset first. If output = "split", then all the probesets will be output, with all the 'significant' probesets first, followed by the other probesets, grouped by Entrez ID.
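The three output modes can be pictured as three orderings of the same probeset table. The sketch below uses base R only and is an illustration of the behavior described above, not the package's implementation; the probeset IDs, Entrez IDs, and the helper order_probes are all hypothetical.

```r
# Hypothetical probesets annotated to one term; all IDs are made up.
probes <- data.frame(
  probeset    = c("p1", "p2", "p3", "p4"),
  entrez      = c("100", "100", "200", "200"),
  significant = c(FALSE, TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Hypothetical helper mimicking the three documented output modes.
order_probes <- function(df, output = c("significant", "all", "split")) {
  output <- match.arg(output)
  switch(output,
    # Only the probesets from the original subsetting.
    significant = df[df$significant, ],
    # Grouped by Entrez ID, significant probeset first within each group.
    all = df[order(df$entrez, !df$significant), ],
    # All significant probesets first, then the rest grouped by Entrez ID.
    split = {
      sig  <- df[df$significant, ]
      rest <- df[!df$significant, ]
      rbind(sig, rest[order(rest$entrez), ])
    }
  )
}

sig_order   <- order_probes(probes, "significant")$probeset
all_order   <- order_probes(probes, "all")$probeset
split_order <- order_probes(probes, "split")$probeset
```

Here "all" interleaves the probesets gene by gene, whereas "split" keeps the significant block together at the top of the table.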

Note that the 'significant' probesets come from one of two sources. First, one can pass a character vector of probeset IDs corresponding to those that were significant in the original analysis (recommended). Second, if the geneIds slot of the GOHyperGParams object contains a named vector of Entrez Gene IDs, then the names from that vector will be used. Such a named vector can be produced by using either findLargest or getUniqueLL.

Since the geneIds are by definition a unique set of Entrez Gene IDs, any duplicate probeset IDs will have been removed, so the first method is to be preferred for accuracy.
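A small base R sketch shows why the names of a unique Entrez Gene ID vector can miss probesets. The mapping below is made up, and !duplicated is only a stand-in for the uniquifying step; findLargest and getUniqueLL apply their own criteria when choosing which probeset to keep.

```r
# Hypothetical probeset-to-Entrez mapping; two probesets share gene "100".
entrez_by_probe <- c(p1 = "100", p2 = "100", p3 = "200")

# Stand-in for the uniquifying step: keep one probeset per Entrez ID.
gene_ids <- entrez_by_probe[!duplicated(entrez_by_probe)]

# Probeset IDs recoverable from the names of the unique vector.
recovered <- names(gene_ids)

# Probesets that can no longer be recovered from those names.
lost <- setdiff(names(entrez_by_probe), recovered)
```

If p2 had been significant in the original analysis, it would be absent from the names of gene_ids, whereas an explicit sigProbesets vector would still include it.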
