getRelevantEGenes: Automatic selection of most relevant effect reporters

Description

1. A-priori filtering of effect reporters (E-genes): select effect reporters whose pattern of differential expression across experiments is unlikely to have arisen by chance.
2. Automated effect reporter subset selection: select the effect reporters that have the highest likelihood under the given network hypothesis.

Usage

filterEGenes(Porig, D, Padj=NULL, ntop=100, fpr=0.05, adjmethod="bonferroni", cutoff=0.05)
getRelevantEGenes(Phi, D, control, nEgenes=min(10*nrow(Phi), nrow(D)))

Arguments

Porig

matrix of raw p-values, typically from the complete array

D

data matrix. Columns correspond to the nodes in the silencing scheme. Rows are effect reporters.

Padj

matrix of false positive rates. If not provided, the Benjamini-Hochberg method for false positive rate computation is used.

ntop

number of top genes to consider from each knock-down experiment

fpr

significance cutoff for the FDR

adjmethod

adjustment method for pattern p-values

cutoff

significance cutoff for patterns

Phi

adjacency matrix with unit main diagonal

control

list of parameters: see set.default.parameters

nEgenes

no. of E-genes to select

Value

I: index of selected E-genes
dat: subset of original data according to I
patterns: significant patterns
nobserved: no. of cases per observed pattern
selected: selected E-genes
mLL: marginal likelihood of a phenotypic hierarchy
pos: posterior distribution of effect positions in the hierarchy
mappos: Maximum a posteriori estimate of effect positions
LLperGene: likelihood per selected E-gene

Details

The method filterEGenes performs an a-priori filtering of the complete microarray. It determines how often E-genes are expected to be differentially expressed across experiments by chance alone. Only those E-genes are kept that show a pattern of differential expression more often than can be expected by chance.
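The role of the Padj, fpr and ntop arguments can be illustrated with a small base R sketch. It is not the package's implementation and does not reproduce the pattern test itself: the toy p-value matrix is made up, and both the column-wise Benjamini-Hochberg adjustment and the way ntop is combined with the cutoff are assumptions chosen for illustration.

```r
# Hypothetical toy data: 6 reporters x 2 knock-down experiments (not from the package).
Porig <- matrix(c(0.001, 0.20, 0.03, 0.50, 0.004, 0.90,
                  0.002, 0.01, 0.60, 0.70, 0.300, 0.04),
                nrow = 6,
                dimnames = list(paste0("g", 1:6), c("kd1", "kd2")))

# Benjamini-Hochberg adjustment per experiment (assumed to be column-wise).
Padj <- apply(Porig, 2, p.adjust, method = "BH")

fpr  <- 0.05  # cutoff applied to the adjusted p-values
ntop <- 2     # number of top reporters considered per experiment

# For each experiment, flag reporters that are among the ntop smallest
# raw p-values AND pass the cutoff on the adjusted p-values.
called <- sapply(colnames(Porig), function(k) {
  top <- order(Porig[, k])[seq_len(ntop)]
  seq_len(nrow(Porig)) %in% top & Padj[, k] <= fpr
})
rownames(called) <- rownames(Porig)

# Each row of 'called' is one reporter's pattern of differential expression.
print(called)
```

Each row of called is the kind of binary pattern whose frequency filterEGenes then compares against what would be expected by chance.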

The method getRelevantEGenes looks for the E-genes that have the highest likelihood under the given network hypothesis. For the scoring type "CONTmLLBayes", these are all E-genes that make a positive contribution to the total log-likelihood. For the type "CONTmLLMAP", all E-genes not assigned to the "null" S-gene are returned. This involves the prior probability delta/no. S-genes for leaving out an E-gene. In all other cases ("CONTmLL", "FULLmLL", "mLL"), the nEgenes E-genes with the highest likelihood under the given network hypothesis are returned.
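The two simpler selection rules can be sketched in base R as follows. This is only an illustration: the per-gene log-likelihood values are invented, the variable names nS and nD are hypothetical stand-ins for nrow(Phi) and nrow(D), and the package computes the actual likelihoods internally.

```r
# Hypothetical per-reporter log-likelihood contributions (made-up values).
LLperGene <- c(g1 = -3.2, g2 = 0.5, g3 = -7.1, g4 = 1.4, g5 = -2.0)

nS <- 4                  # stand-in for nrow(Phi), the number of S-genes
nD <- length(LLperGene)  # stand-in for nrow(D), the number of E-genes

# Default from the Usage section: min(10 * nrow(Phi), nrow(D)).
nEgenes <- min(10 * nS, nD)

# Rule for "CONTmLL", "FULLmLL" and "mLL": keep the nEgenes best-scoring E-genes.
# A smaller value (2) is used here so that the selection is visible.
top2 <- names(LLperGene)[order(LLperGene, decreasing = TRUE)[seq_len(2)]]

# Rule for "CONTmLLBayes": keep E-genes with a positive contribution.
positive <- names(LLperGene)[LLperGene > 0]

print(nEgenes)
print(top2)
print(positive)
```

With only five reporters and four S-genes, the default nEgenes equals the number of reporters, so the top-nEgenes rule only prunes anything when D has more than 10 rows per S-gene or nEgenes is set explicitly.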

Examples

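   # the functions and data set used below are provided by the nem package
   library(nem)

   # Drosophila RNAi and Microarray Data from Boutros et al, 2002
   data("BoutrosRNAi2002")
   D <- BoutrosRNAiDiscrete[,9:16]

   # enumerate all possible models for 4 genes
   Sgenes <- unique(colnames(D))
   models <- enumerate.models(Sgenes)

   # select the most relevant E-genes under one of the enumerated network hypotheses
   control <- set.default.parameters(Sgenes, para=c(.13,.05), type="mLL")
   getRelevantEGenes(models[[64]], D, control=control)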
