tmod (version 0.19)

tmodUtest: Perform a statistical test of module expression

Description

Perform a statistical test of module expression

Usage

tmodUtest(l, modules = NULL, qval = 0.05, order.by = "pval",
  filter = FALSE, mset = "LI", cols = "Title", useR = FALSE)

tmodCERNOtest(l, modules = NULL, qval = 0.05, order.by = "pval",
  filter = FALSE, mset = "LI", cols = "Title", useR = FALSE)

tmodHGtest(fg, bg, modules = NULL, qval = 0.05, order.by = "pval",
  filter = FALSE, mset = "LI", cols = "Title")

Arguments

l
sorted list of HGNC gene identifiers
modules
optional list of modules for which to make the test
qval
Threshold FDR value to report
order.by
Order by P value ("pval") or none ("none")
filter
Remove gene names which have no module assignments
mset
Which module set to use. Either a character vector ("LI", "DC" or "all", default: LI) or a list (see "Custom module definitions" below)
cols
Which columns from the MODULES data frame should be included in the results
useR
use the R wilcox.test function; slow, but with exact p-values for small samples
fg
foreground gene set for the HG test
bg
background gene set for the HG test

Value

  • A data frame with module names, an additional statistic (e.g. enrichment or AUC, depending on the test), the P value and the FDR q-value (the P value corrected for multiple testing using the p.adjust function with the Benjamini-Hochberg correction).
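
The multiple-testing correction described above can be sketched in base R. The P values and the column names used here (ID, P.Value, adj.P.Val) are hypothetical and chosen only for illustration; they are not taken from actual tmod output.

```r
# Hypothetical raw P values for five modules (illustrative numbers only)
pvals <- c(M1 = 0.001, M2 = 0.004, M3 = 0.06, M4 = 0.2, M5 = 0.8)

# Benjamini-Hochberg correction, as named in the Value section
qvals <- p.adjust(pvals, method = "BH")

# Assemble a result table (column names are an assumption) and keep
# only the rows below the FDR threshold, analogous to qval = 0.05
res <- data.frame(ID = names(pvals), P.Value = pvals, adj.P.Val = qvals,
                  stringsAsFactors = FALSE)
sig <- res[res$adj.P.Val < 0.05, ]
print(sig)
```

In this sketch only the first two hypothetical modules pass the threshold.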

Custom module definitions

Arbitrary custom module, gene set or pathway definitions can also be provided through the mset option, if the parameter is a list rather than a character vector. The list passed to mset must contain the following members: "MODULES", "MODULES2GENES" and "GENES".

"MODULES" and "GENES" are data frames. It is required that MODULES contains the following columns: "ID", specifying a unique identifier of a module, and "Title", containing the description of the module. The data frame "GENES" must contain the column "ID".

The list MODULES2GENES is a mapping between modules and genes. The names of the list must correspond to the ID column of the MODULES data frame. The members of the list are character vectors, and the values of these vectors must correspond to the ID column of the GENES data frame.
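
A minimal sketch of such a list, built with base R only. All module and gene identifiers below are made up for illustration, and the two consistency checks simply restate the requirements given above.

```r
# Hypothetical module and gene identifiers, for illustration only
MODULES <- data.frame(
  ID    = c("MOD1", "MOD2"),
  Title = c("Example module one", "Example module two"),
  stringsAsFactors = FALSE
)
GENES <- data.frame(
  ID = c("GENE_A", "GENE_B", "GENE_C", "GENE_D"),
  stringsAsFactors = FALSE
)

# Names must match MODULES$ID; values must come from GENES$ID
MODULES2GENES <- list(
  MOD1 = c("GENE_A", "GENE_B"),
  MOD2 = c("GENE_B", "GENE_C", "GENE_D")
)

my_mset <- list(MODULES = MODULES, MODULES2GENES = MODULES2GENES, GENES = GENES)

# Consistency checks mirroring the requirements stated above
stopifnot(setequal(names(my_mset$MODULES2GENES), my_mset$MODULES$ID))
stopifnot(all(unlist(my_mset$MODULES2GENES) %in% my_mset$GENES$ID))
```

With tmod loaded, the list would then be passed through the mset argument, for example tmodUtest(l, mset = my_mset).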

Details

Performs a test either on an ordered list of genes (tmodUtest, tmodCERNOtest) or on two groups of genes (tmodHGtest). tmodUtest is a U test on the ranks of the genes that are contained in a module.
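
The idea of a U test on ranks can be illustrated with base R. This is a sketch and not the tmod implementation: the gene names and the module are hypothetical, and the one-sided alternative is an assumption made here.

```r
# Hypothetical ordered gene list (most interesting genes first)
l <- paste0("G", 1:100)
# Hypothetical module whose genes sit mostly near the top of the list
module <- c("G2", "G5", "G7", "G11", "G30")

ranks  <- seq_along(l)
in_mod <- l %in% module

# U test: are the ranks of module genes shifted towards the top?
# (one-sided alternative is an assumption of this sketch)
wt <- wilcox.test(ranks[in_mod], ranks[!in_mod], alternative = "less")

# AUC as an effect size: fraction of (module, non-module) gene pairs
# in which the module gene is ranked higher in the list
pairs_won <- sum(outer(ranks[!in_mod], ranks[in_mod], ">"))
auc <- pairs_won / (sum(in_mod) * sum(!in_mod))

print(wt$p.value)
print(auc)
```

An AUC close to 0.5 would indicate no enrichment; values close to 1 indicate that the module genes are concentrated at the top of the list.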

tmodCERNOtest is also a nonparametric test working on gene ranks, but it originates from Fisher's combined probability test. This test gives more weight to genes with lower ranks, so the resulting p-values correspond better to the observed effect size. In effect, modules with a small effect but many genes get higher p-values than with the U test.
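
A Fisher-style combination of ranks can be sketched as follows. The statistic used here, -2 times the sum of log(rank / N), compared against a chi-squared distribution with twice as many degrees of freedom as there are module genes, is the usual form of Fisher's method applied to ranks; that tmodCERNOtest computes exactly this is an assumption of the sketch, and the ranks are hypothetical.

```r
N <- 100                               # total number of genes in the list
module_ranks <- c(2, 5, 7, 11, 30)     # hypothetical ranks of module genes

# Each rank is turned into a pseudo p-value rank / N and the values are
# combined as in Fisher's method
contrib <- -2 * log(module_ranks / N)
cerno   <- sum(contrib)

# Under the null, the statistic follows a chi-squared distribution
# with 2 * (number of module genes) degrees of freedom
p_cerno <- pchisq(cerno, df = 2 * length(module_ranks), lower.tail = FALSE)

print(round(contrib, 2))   # genes near the top of the list contribute most
print(p_cerno)
```

The per-gene contributions show the weighting: the gene at rank 2 adds far more to the statistic than the gene at rank 30.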

tmodHGtest is simply a hypergeometric test.
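
For a single module, a hypergeometric test can be sketched with phyper from base R. The gene sets below are hypothetical, and the enrichment value computed here (ratio of the module proportion in the foreground to that in the background) is an assumed definition, not necessarily the one reported by tmodHGtest.

```r
# Hypothetical background, foreground and module gene sets
bg     <- paste0("G", 1:1000)
fg     <- paste0("G", 1:50)
module <- paste0("G", c(1:10, 501:520))

b <- sum(fg %in% module)   # module genes in the foreground
n <- length(fg)            # foreground size
B <- sum(bg %in% module)   # module genes in the background
N <- length(bg)            # background size

# P(X >= b) when drawing n genes from N, of which B belong to the module
p_hg <- phyper(b - 1, B, N - B, n, lower.tail = FALSE)

# Enrichment: module proportion in foreground relative to background
E <- (b / n) / (B / N)

print(p_hg)
print(E)
```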

In tmod, two module sets can be used: "LI" (from Li et al. 2013) or "DC" (from Chaussabel et al. 2008). The module set is selected with the parameter "mset"; if mset is "all", both sets are used.

See Also

tmod-package

Examples

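library(tmod)
data(tmod)
## Hypergeometric test: genes of module LI.M127 as foreground,
## all genes known to tmod as background
fg <- tmod$MODULES2GENES[["LI.M127"]]
bg <- tmod$GENES$ID
result <- tmodHGtest(fg, bg)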

## A more sophisticated example
## Gene set enrichment in TB patients compared to
## healthy controls (Egambia data set)

library(limma)
data(Egambia)
design <- cbind(Intercept=rep(1, 30), TB=rep(c(0,1), each= 15))
fit <- eBayes( lmFit(Egambia[,-c(1:3)], design))
tt <- topTable(fit, coef=2, number=Inf, genelist=Egambia[,1:3] )
tmodUtest(tt$GENE_SYMBOL)