limma (version 3.28.14)

camera: Competitive Gene Set Test Accounting for Inter-gene Correlation

Description

Test whether a set of genes is highly ranked relative to other genes in terms of differential expression, accounting for inter-gene correlation.

Usage

"camera"(y, index, design, contrast = ncol(design), weights = NULL, use.ranks = FALSE, allow.neg.cor=FALSE, inter.gene.cor=0.01, trend.var = FALSE, sort = TRUE, ...) interGeneCorrelation(y, design)

Arguments

y
a numeric matrix of log-expression values or log-ratios of expression values, or any data object containing such a matrix. Rows correspond to probes and columns to samples. Any type of object that can be processed by getEAWP is acceptable.
index
an index vector or a list of index vectors. Can be any vector such that y[index,] selects the rows corresponding to the test set. The list can be made using ids2indices.
design
design matrix.
contrast
contrast of the linear model coefficients for which the test is required. Can be an integer specifying a column of design, or else a numeric vector of same length as the number of columns of design.
weights
can be a numeric matrix of individual weights, of same size as y, or a numeric vector of array weights with length equal to ncol(y), or a numeric vector of gene weights with length equal to nrow(y).
use.ranks
do a rank-based test (TRUE) or a parametric test (FALSE)?
allow.neg.cor
should reduced variance inflation factors be allowed for negative correlations?
inter.gene.cor
numeric, optional preset value for the inter-gene correlation within tested sets. If NA or NULL, then an inter-gene correlation will be estimated for each tested set.
trend.var
logical, should an empirical Bayes trend be estimated? See eBayes for details.
sort
logical, should the results be sorted by p-value?
...
other arguments are not currently used

Value

camera returns a data.frame with a row for each set and the following columns:
NGenes
number of genes in set.
Correlation
inter-gene correlation (only included if the inter.gene.cor was not preset).
Direction
direction of change ("Up" or "Down").
PValue
two-tailed p-value.
FDR
Benjamini and Hochberg FDR adjusted p-value.
interGeneCorrelation returns a list with components:
vif
variance inflation factor.
correlation
inter-gene correlation.

Details

camera and interGeneCorrelation implement methods proposed by Wu and Smyth (2012). camera performs a competitive test in the sense defined by Goeman and Buhlmann (2007). It tests whether the genes in the set are highly ranked in terms of differential expression relative to genes not in the set. It has similar aims to geneSetTest but accounts for inter-gene correlation. See roast for an analogous self-contained gene set test.

The function can be used for any microarray experiment which can be represented by a linear model. The design matrix for the experiment is specified as for the lmFit function, and the contrast of interest is specified as for the contrasts.fit function. This allows users to focus on differential expression for any coefficient or contrast in a linear model by giving the vector of test statistic values.

camera estimates p-values after adjusting the variance of test statistics by an estimated variance inflation factor. The inflation factor depends on estimated genewise correlation and the number of genes in the gene set.

By default, camera uses interGeneCorrelation to estimate the mean pair-wise correlation within each set of genes. camera can alternatively be used with a preset correlation specified by inter.gene.cor that is shared by all sets. This usually works best with a small value, say inter.gene.cor=0.01.

If interGeneCorrelation=NA, then camera will estimate the inter-gene correlation for each set. In this mode, camera gives rigorous error rate control for all sample sizes and all gene sets. However, in this mode, highly co-regulated gene sets that are biological interpretable may not always be ranked at the top of the list.

With interGeneCorrelation=0.01, camera will rank biologically interpetable sets more highly. This gives a useful compromise between strict error rate control and interpretable gene set rankings.

References

Wu, D, and Smyth, GK (2012). Camera: a competitive gene set test accounting for inter-gene correlation. Nucleic Acids Research 40, e133. http://nar.oxfordjournals.org/content/40/17/e133

Goeman, JJ, and Buhlmann, P (2007). Analyzing gene expression data in terms of gene sets: methodological issues. Bioinformatics 23, 980-987.

See Also

getEAWP

rankSumTestWithCorrelation, geneSetTest, roast, fry, romer, ids2indices.

There is a topic page on 10.GeneSetTests.

Examples

Run this code
y <- matrix(rnorm(1000*6),1000,6)
design <- cbind(Intercept=1,Group=c(0,0,0,1,1,1))

# First set of 20 genes are genuinely differentially expressed
index1 <- 1:20
y[index1,4:6] <- y[index1,4:6]+1

# Second set of 20 genes are not DE
index2 <- 21:40
 
camera(y, index1, design)
camera(y, index2, design)

camera(y, list(set1=index1,set2=index2), design, inter.gene.cor=NA)
camera(y, list(set1=index1,set2=index2), design, inter.gene.cor=0.01)

Run the code above in your browser using DataLab