calcCoGAPSStat: CoGAPS gene set statistic

Description

Computes the p-value for the association of underlying patterns from microarray data to activity in gene sets.

Usage

calcCoGAPSStat(Amean, Asd, GStoGenes, numPerm=500)

Arguments

Amean

Sampled mean value of the amplitude matrix $A$. row.names(Amean) must correspond to the gene names contained in GStoGenes.

Asd

Sampled standard deviation of the amplitude matrix $A$.

GStoGenes

List or data frame containing the genes in each gene set. If a list, gene set names are the list names and corresponding elements are the names of genes contained in each set. If a data frame, gene set names are in the first column and corresponding gene names are listed in rows beneath each gene set name.
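As a minimal sketch of the list form, the snippet below builds a small GStoGenes list and checks it against the row names of a toy Amean matrix. The gene names, set names, and values are hypothetical and chosen only for illustration.

```r
# Toy amplitude means: 4 genes (rows) by 2 patterns (columns).
# All names and values here are made up for illustration.
Amean <- matrix(c(1.0, 0.2,
                  0.8, 0.1,
                  0.1, 0.9,
                  0.3, 0.7),
                nrow = 4, byrow = TRUE,
                dimnames = list(c("geneA", "geneB", "geneC", "geneD"),
                                c("pattern1", "pattern2")))

# List form: list names are gene set names, elements are gene names.
GStoGenes <- list(setX = c("geneA", "geneB"),
                  setY = c("geneC", "geneD"))

# Genes named in a set but absent from row.names(Amean);
# an empty result means the two inputs agree.
missingGenes <- setdiff(unlist(GStoGenes), row.names(Amean))
```

Running a check like this before the call makes a mismatch between gene set members and row.names(Amean) visible early.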

numPerm

Number of permutations used for the null distribution in the gene set statistic. (optional; default=500)

Value

A list containing:
GSUpreg: p-values for upregulation of each gene set in each pattern.
GSDownreg: p-values for downregulation of each gene set in each pattern.
GSActEst: p-values for activity of each gene set in each pattern.

Details

This script links the patterns identified in the columns of $P$ to activity in each of the gene sets specified in GStoGenes using a novel z-score based statistic developed in Ochs et al. (2009). Specifically, the z-score for pattern $p$ and gene set $G_{i}$ containing $G$ total genes is given by $$Z_{i,p} = \frac{1}{G} \sum_{g \in G_{i}} \frac{A_{gp}}{\sigma_{gp}},$$ where $g$ indexes the genes in the set and $\sigma_{gp}$ is the standard deviation of $A_{gp}$ obtained from MCMC sampling. CoGAPS then runs numPerm random sample tests to compute a consistent p-value estimate from that z-score.
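The z-score formula above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the helper names geneSetZ and permPValueUp are hypothetical, the data are simulated, and the null distribution is assumed to come from random gene sets of the same size drawn from row.names(Amean).

```r
# Z-score for one gene set in one pattern:
# the mean of A_gp / sigma_gp over the genes in the set.
geneSetZ <- function(Amean, Asd, genes, pattern) {
  mean(Amean[genes, pattern] / Asd[genes, pattern])
}

# Upregulation p-value by permutation (assumed scheme): the fraction of
# random same-size gene sets whose z-score is at least the observed one.
permPValueUp <- function(Amean, Asd, genes, pattern, numPerm = 500) {
  zObs <- geneSetZ(Amean, Asd, genes, pattern)
  zNull <- replicate(numPerm, {
    randomGenes <- sample(row.names(Amean), length(genes))
    geneSetZ(Amean, Asd, randomGenes, pattern)
  })
  mean(zNull >= zObs)
}

# Simulated inputs: 20 genes, 2 patterns, constant standard deviation.
set.seed(1)
geneNames <- paste0("gene", 1:20)
Amean <- matrix(runif(40), nrow = 20,
                dimnames = list(geneNames, c("p1", "p2")))
Asd <- matrix(0.5, nrow = 20, ncol = 2, dimnames = dimnames(Amean))

# Make one hypothetical gene set strongly active in pattern p1.
activeSet <- c("gene1", "gene2", "gene3")
Amean[activeSet, "p1"] <- 5

z <- geneSetZ(Amean, Asd, activeSet, "p1")
pUp <- permPValueUp(Amean, Asd, activeSet, "p1", numPerm = 500)
```

Dividing by the standard deviation downweights genes whose amplitude is poorly determined by the MCMC sampling, so a set scores highly only when its genes are both strongly and consistently assigned to the pattern.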

References