hyperGoutput(hyptObj, eset, pvalue, categorySize, sigProbesets, fit = NULL, subset = NULL, comp = 1, output = c("significant", "all", "split"), statistics = c("tstat", "pval", "FC"), html = TRUE, text = TRUE, ...)
hyptObj: HyperGResult object, usually produced by a call to hyperGTest
eset: ExpressionSet object
HyperGResult object
fit: MArrayLM object to use for extracting relevant statistics. See details for more information
html: TRUE
text: TRUE
...: anncols argument, passed to probes2table to control the hyperlinked annotation columns. See aaf.handler for more information
1.) Compute expression values
2.) Fit a model using limma
3.) Output significant probesets using limma2annaffy
4.) Perform hypergeometric test using hyperGTest
At step 4, one can output a list of the over-represented terms using
htmlReport. One might then be interested in knowing which probesets
contributed to the significance of a particular term, which is what this
function is designed to do.
One argument that can be passed to htmlReport (and also to hyperGoutput) is
categorySize, which gives a lower bound for the number of probesets with a
particular term in the universe. For example, assume that a particular GO
term is annotated to three probesets on a given chip. If, after doing a
t-test to detect differentially expressed probesets, one of those probesets
was found to be significantly differentially expressed and was then used in
a hypergeometric test, that GO term would be significant, with a small
p-value. However, this is probably not very strong evidence that the GO term
is actually over-represented, since only three probesets were annotated to
it to begin with. By setting categorySize to a sensible value (such as 10),
this situation can be avoided.
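The small-category effect described above can be checked with a plain hypergeometric tail probability. The following is a minimal sketch in base R, not the computation hyperGTest itself performs (which may, for instance, use a conditional test); all of the counts are hypothetical and chosen only for illustration.

```r
# Hypothetical counts, chosen only for illustration.
universe_size <- 10000  # unique genes in the universe
n_selected    <- 50     # significant genes submitted to the test
term_size     <- 3      # genes in the universe annotated to the GO term
n_hits        <- 1      # significant genes annotated to the term

# Upper-tail hypergeometric probability P(X >= n_hits).
p_small_term <- phyper(n_hits - 1, term_size, universe_size - term_size,
                       n_selected, lower.tail = FALSE)
print(p_small_term)
```

With these made-up counts the p-value falls below 0.05 even though a single gene supports the term, which is the situation a categorySize cutoff is meant to filter out.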
This function will output HTML and/or text tables containing annotation
information about each probeset as well as the expression values. In
addition, if limma was used to fit the model, the relevant statistics
(t-statistic, p-value, fold change) can also be output in the table by
passing the MArrayLM object that resulted from a call to eBayes. The
statistics argument can be used to control which statistics are output.
By default hyperGoutput will output tables for all significant terms,
which may end up being quite a few tables. Usually only a few terms are of
interest, so there is a subset argument that can be used to select
only those terms. The values passed to subset are row positions in the table
output by htmlReport or summary. For instance, if the first, third and fifth
terms in the HTML table output by htmlReport were of interest, one would use
subset=c(1,3,5).
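As a small illustration of how the subset values line up with table rows, the sketch below uses a made-up data frame as a stand-in for the table returned by summary. The column names, term labels and p-values are hypothetical, not the real summary layout.

```r
# Hypothetical stand-in for the summary table; labels and p-values are made up.
terms <- data.frame(
  term   = c("termA", "termB", "termC", "termD", "termE"),
  pvalue = c(0.001, 0.004, 0.010, 0.020, 0.030),
  stringsAsFactors = FALSE
)

# The same vector one would pass as subset = c(1, 3, 5).
subset_idx <- c(1, 3, 5)

# Rows 1, 3 and 5 of the table are the terms that would be selected.
selected <- terms[subset_idx, ]
print(selected$term)
```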
One critical step prior to the hypergeometric test is to subset the
probesets to unique Entrez Gene IDs. Note, however, that the
functions used by hyperGoutput will output all the probesets
annotated to a particular term. The output argument is used to
control this behavior. If output = "significant" (the default), then only
those probesets that correspond to the original subsetting will be output.
If output = "all", then all probesets will be output (grouped by Entrez ID),
with the 'significant' probeset first. If output = "split", then all the
probesets will be output, with all the 'significant' probesets first,
followed by the other probesets, grouped by Entrez ID.
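The three orderings can be mimicked on a toy table. This is only a sketch of the behavior as described above, written in base R; it is not the code hyperGoutput uses, and the probeset IDs, Entrez IDs and significance flags are hypothetical.

```r
# Toy annotation: hypothetical probesets for one term, mapped to Entrez Gene IDs.
probes <- data.frame(
  probe  = c("p1", "p2", "p3", "p4", "p5"),
  entrez = c("10", "10", "20", "20", "30"),
  sig    = c(TRUE, FALSE, FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# "significant": only the probesets from the original subsetting.
out_sig <- probes$probe[probes$sig]

# "all": every probeset, grouped by Entrez ID, significant one first in each group.
out_all <- probes$probe[order(probes$entrez, !probes$sig)]

# "split": all significant probesets first, then the rest grouped by Entrez ID.
out_split <- probes$probe[order(!probes$sig, probes$entrez)]

print(list(significant = out_sig, all = out_all, split = out_split))
```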
Note that the 'significant' probesets come from one of two sources. First,
one can pass a character vector of probeset IDs corresponding to those that
were significant in the original analysis (recommended). Second, if the
geneIds slot of the GOHyperGParams object contains a named
vector of Entrez Gene IDs, then the names from that vector will be used.
Such a named vector can be created by using either
findLargest or getUniqueLL.
Since the geneIds are by definition a unique set of Entrez Gene IDs,
any probeset IDs that map to an already-present Entrez Gene ID will have
been removed, so the first method is to be preferred for accuracy.
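The difference between the two sources can be seen on a toy named vector. In this sketch the probeset and Entrez IDs are hypothetical, and duplicated() is only a stand-in for the reduction to unique Entrez Gene IDs; findLargest and getUniqueLL choose which probeset to keep by their own rules.

```r
# Hypothetical significant probesets (names) and their Entrez Gene IDs (values).
sig_ids <- c(p1 = "10", p2 = "10", p4 = "20")

# Source 1: the probeset IDs themselves, as one would pass to sigProbesets.
sig_probesets <- names(sig_ids)

# Source 2: names that survive after reducing to unique Entrez Gene IDs.
unique_ids <- sig_ids[!duplicated(sig_ids)]
from_gene_ids <- names(unique_ids)

# Significant probesets that the second source no longer knows about.
lost <- setdiff(sig_probesets, from_gene_ids)
print(lost)
```

Here two significant probesets share one Entrez Gene ID, so only one of them survives in the named vector, which is why passing the probeset IDs directly is the more accurate route.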
hyperGTest,
htmlReport,
probeSetSummary