(1) BuildRN:
This function builds a relevance correlation network of the model pathway signature in the data set in which the pathway activity estimate is desired. We point out that this step is totally unsupervised and does not use any phenotypic information of the samples.
(2) EvalConsNet:
This function evaluates the consistency of the inferred network with the prior information of the model pathway signature. The up/down regulatory pattern given by the model signature implies predictions about the directionality of the gene-gene correlations in the independent data set. For instance, if gene "A" is upregulated and gene "B" is downregulated, then assuming that the model signature has any relevance in the independent data set, we would expect genes "A" and "B" to be anti-correlated. Thus, a consistency score can be computed. Only if the consistency score is higher than the score expected by random chance is it recommended that the model signature be used to infer pathway activity.
(3) PruneNet:
This function obtains the pruned, i.e. consistent, network, in which every edge represents a significant correlation in gene expression whose directionality agrees with that predicted by the prior information. This is the denoising step of the algorithm. The function returns the whole pruned network and its maximally connected component.
(4) PredActScore:
Given the adjacency matrix of the maximally connected consistent subnetwork and given the regulatory weights of the corresponding model pathway signature, this function estimates a pathway activation score in each sample. This function can also be used to infer pathway activity in another independent data set using the inferred subnetwork.
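The four steps above can be sketched in base R on simulated data. This is only an illustration under stated assumptions, not the package's implementation: the variable names (`expr`, `sign_v`, `rel_adj`, `pruned_adj`) are hypothetical, edges are called with a plain p-value threshold rather than an FDR estimate, isolated genes are dropped instead of extracting the maximally connected component, and the degree-weighted score in step 4 is an assumed form.

```r
set.seed(1)
n_samples <- 40

# Hypothetical latent pathway activity and model signature
# (+1 = upregulated, -1 = downregulated).
activity <- rnorm(n_samples)
sign_v <- c(g1 = 1, g2 = 1, g3 = 1, g4 = -1, g5 = -1, g6 = 1)

# Simulated expression (genes x samples): g1..g5 follow the signature,
# g6 is pure noise.
expr <- t(sapply(names(sign_v), function(g) {
  signal <- if (g == "g6") 0 else sign_v[[g]] * activity
  signal + rnorm(n_samples, sd = 0.5)
}))

# Step 1 (cf. BuildRN): signed relevance network from pairwise correlations.
cor_m <- cor(t(expr))
diag(cor_m) <- 0
t_stat <- cor_m * sqrt((n_samples - 2) / (1 - cor_m^2))
p_m <- 2 * pt(-abs(t_stat), df = n_samples - 2)
rel_adj <- (p_m < 0.05) * sign(cor_m)  # assumption: fixed p-value cutoff

# Step 2 (cf. EvalConsNet): fraction of edges whose sign matches the
# sign predicted by the signature (product of the two regulatory signs).
pred <- outer(sign_v, sign_v)
edges <- upper.tri(rel_adj) & rel_adj != 0
fcons <- mean(rel_adj[edges] == pred[edges])

# Step 3 (cf. PruneNet): keep only the consistent edges.
pruned_adj <- (rel_adj != 0 & rel_adj == pred) * 1

# Step 4 (cf. PredActScore): assumed degree-weighted score per sample,
# computed on row-wise z-scored expression.
degree <- rowSums(pruned_adj)
keep <- degree > 0
z <- t(scale(t(expr[keep, , drop = FALSE])))
w <- degree[keep] * sign_v[keep]
score <- as.vector(w %*% z) / sqrt(sum(degree[keep]^2))
```

In this simulation the score can be compared with the known `activity` vector, for example with `cor(score, activity)`. The noise gene `g6` contributes only if it happens to pick up a significant, sign-consistent edge.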
Before performing the pruning step, DoDART will check whether the relevance correlation network is significantly consistent with the predictions from the model signature. Significance is assessed by first computing a consistency score (in effect, the fraction of edges in the relevance network which are consistent with the model prediction) and then running 1000 random permutations to obtain an empirical null distribution for the consistency score. Model signatures whose consistency scores have empirical P-values less than 0.001 are deemed consistent. If the consistency score is not significant, the function will issue a warning, and using the signature to predict pathway activity is not recommended.
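The permutation check described above can be illustrated with a toy signed network. This is a minimal sketch under assumptions: the null distribution here is generated by shuffling the signature signs across genes, which may differ from the randomisation scheme the package uses, and `consistency_score` is a hypothetical helper, not a DART function.

```r
set.seed(42)

# Hypothetical signature: 6 upregulated and 6 downregulated genes.
sign_v <- c(rep(1, 6), rep(-1, 6))
names(sign_v) <- paste0("g", seq_along(sign_v))

# Toy signed network: fully connected and consistent with the signature,
# then 6 of the 66 edges are flipped to mimic inconsistent correlations.
signed_adj <- outer(sign_v, sign_v)
diag(signed_adj) <- 0
flip <- sample(which(upper.tri(signed_adj)), 6)
signed_adj[flip] <- -signed_adj[flip]
signed_adj[lower.tri(signed_adj)] <- t(signed_adj)[lower.tri(signed_adj)]

# Fraction of edges whose sign equals the sign predicted by the signature.
consistency_score <- function(signed_adj, sign_v) {
  pred <- outer(sign_v, sign_v)
  edges <- upper.tri(signed_adj) & signed_adj != 0
  mean(signed_adj[edges] == pred[edges])
}

obs <- consistency_score(signed_adj, sign_v)

# Empirical null: recompute the score after shuffling the signature signs.
n_perm <- 1000
null_scores <- replicate(n_perm, consistency_score(signed_adj, sample(sign_v)))

# Fraction of permutations scoring at least as high as the observed value.
p_emp <- mean(null_scores >= obs)
```

With this plain fraction estimator the empirical P-value can be exactly zero. A common alternative, `(1 + sum(null_scores >= obs)) / (1 + n_perm)`, never falls below roughly 1/n_perm.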
DoDART(data.m, sign.v, fdr)
data.m
must be valid unique gene (probe) identifiers.
sign.v
are the gene identifiers, which must match the gene (probe) identifiers of the rows of data.m.
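The identifier requirements above can be checked before calling DoDART. The helper below is a hypothetical sketch, not part of the package, and it assumes that the gene identifiers are carried as `rownames()` of the data matrix and as `names()` of the signature vector.

```r
# Hypothetical pre-flight check: returns the signature genes found in the data.
check_inputs <- function(data_m, sign_v) {
  ids <- rownames(data_m)
  # Row identifiers must exist and be unique.
  stopifnot(is.matrix(data_m), !is.null(ids), !anyDuplicated(ids))
  # The signature must be a named numeric vector.
  stopifnot(is.numeric(sign_v), !is.null(names(sign_v)))
  common <- intersect(names(sign_v), ids)
  if (length(common) == 0) {
    stop("no signature genes found among the row identifiers of the data")
  }
  common
}

# Toy inputs: 3 genes x 4 samples, signature covering two of the genes.
toy_data <- matrix(rnorm(12), nrow = 3,
                   dimnames = list(c("g1", "g2", "g3"), paste0("s", 1:4)))
toy_sign <- c(g1 = 1, g3 = -1, g9 = 1)
matched <- check_inputs(toy_data, toy_sign)
```

Here `g9` is absent from the toy data, so only `g1` and `g3` are returned.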
BuildRN, fE is the ratio of the number of edges in the relevance network to the maximum possible number, fconsE is the fraction of edges whose sign (i.e. the sign of the correlation) is the same as the directionality predicted by the model signature, and Pval(consist) is a p-value reflecting the significance of fconsE, estimated as the fraction of randomisations that yielded an average connectivity larger than the observed one.
data.m.
EvalConsNet.
Teschendorff AE, Gomez S, Arenas A, El-Ashry D, Schmidt M, et al. (2010) Improved prognostic classification of breast cancer defined by antagonistic activation patterns of immune response pathway modules. BMC Cancer 10:604.
### Example
### load the package and the example data:
library(DART);
data(dataDART);
### dataDART$data: mRNA expression data of 67 ER negative breast cancer samples.
### dataDART$pheno: 51 basals and 16 HER2+ (ERBB2+).
### dataDART$phenoMAINZ: 24 basals and 8 HER2+ (ERBB2+).
### dataDART$sign: perturbation signature of ERBB2 activation.
### Using DoDART
dart.o <- DoDART(dataDART$data,dataDART$sign,fdr=0.000001);
### check that activation is higher in HER2+ compared to basals
boxplot(dart.o$score ~ dataDART$pheno);
pv <- wilcox.test(dart.o$score ~ dataDART$pheno)$p.value;
text(x=1.5,y=3.8,labels=paste("P=",pv,sep=""));