ppa.unique: Filter co-modules that are very similar to each other

Description

From a potentially non-unique set of PPA co-modules, create a unique set by removing all co-modules that are similar to others.

Usage

# S4 method for list,list
ppa.unique (normed.data, pparesult, ...)

Value

A named list, the filtered pparesult. See the return value of ppa.iterate for the details.

Arguments

normed.data: The normalized input data, a list of four matrices, usually the output of the ppa.normalize function.
pparesult: The result of a PPA run, a set of co-modules.
...: Additional arguments, see details below.

Author

Gabor Csardi Gabor.Csardi@unil.ch

Details

This function can be called as


    ppa.unique(normed.data, pparesult, method = c("cor"),
               ignore.div = TRUE, cor.limit = 0.9,
               neg.cor = TRUE, drop.zero = TRUE)

where the arguments are:

normed.data: The normalized input data, a list of four matrices, usually the output of ppa.normalize.
pparesult: The results of a PPA run, a set of co-modules.
method: Character scalar giving the method used to decide whether two co-modules are similar. Currently only ‘cor’ is implemented: it keeps both co-modules of a pair if the Pearson correlation of their row1, row2 and column scores is less than cor.limit. See also the neg.cor argument.
ignore.div: Logical scalar, if TRUE, then the divergent co-modules will be removed.
cor.limit: Numeric scalar, giving the correlation limit for the ‘cor’ method.
neg.cor: Logical scalar, if TRUE, then the ‘cor’ method considers the absolute value of the correlation.
drop.zero: Logical scalar, whether to drop co-modules that have all zero scores.
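
The ‘cor’ method described above can be illustrated with a minimal base R sketch. It is not the package's implementation: keep.unique.sketch is a hypothetical helper, and it assumes that two co-modules count as similar only when the correlations of their row1, row2 and column scores all reach cor.limit, and that a co-module is dropped when it is similar to any co-module that precedes it.

```r
## Hypothetical helper, NOT part of the package: returns a logical vector
## marking the co-modules to keep.
## Assumption: a pair is "similar" only if the row1, row2 and column
## correlations ALL reach cor.limit (so the smallest of the three decides).
keep.unique.sketch <- function(rows1, rows2, columns,
                               cor.limit = 0.9, neg.cor = TRUE) {
  ## With neg.cor = TRUE, strong negative correlations count as similarity
  ABS <- if (neg.cor) abs else identity
  ## Element-wise minimum of the three module-by-module correlation matrices
  cm <- pmin(ABS(cor(rows1)), ABS(cor(rows2)), ABS(cor(columns)))
  ## Compare each co-module only with the ones before it
  cm[lower.tri(cm, diag = TRUE)] <- 0
  apply(cm < cor.limit, 2, all)
}

## Toy scores: co-module m2 is an exact negative multiple of m1
x <- c(1, 2, 3, 4, 5, 6); y <- c(1, -1, 1, -1, 1, -1)
u <- c(1, 2, 3, 4);       v <- c(1, -1, -1, 1)
w <- c(1, 0, 2, 5, 3);    z <- c(5, 1, 4, 2, 3)
rows1   <- cbind(m1 = x, m2 = -2 * x, m3 = y)
rows2   <- cbind(m1 = u, m2 = -u,     m3 = v)
columns <- cbind(m1 = w, m2 = -3 * w, m3 = z)

keep.abs <- keep.unique.sketch(rows1, rows2, columns, neg.cor = TRUE)
keep.raw <- keep.unique.sketch(rows1, rows2, columns, neg.cor = FALSE)
```

With these toy scores, m2 is dropped when neg.cor is TRUE, because its correlation with m1 is -1 in all three score sets, and it is kept when neg.cor is FALSE.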

References

Kutalik Z, Bergmann S, Beckmann J: A modular approach for integrative analysis of large-scale gene-expression and drug-response data. Nat Biotechnol 2008 May; 26(5): 531-9.

Examples

## Create a PPA module set
set.seed(1)
insili <- ppa.in.silico(noise=0.01)

## Random seeds
seeds <- generate.seeds(length=nrow(insili[[1]]), count=20)

## Normalize input matrix
nm <- ppa.normalize(insili[1:2])

## Do PPA
ppares <- ppa.iterate(nm, row1.seeds=seeds,
                      thr.row1=2, thr.row2=2, thr.col=1)

## Check correlation among modules
cc <- cor(ppares$rows1)
if (interactive()) { hist(cc[lower.tri(cc)],10) }

## Some of them are quite high, how many?
undiag <- function(x) { diag(x) <- 0; x }
sum(undiag(cc) > 0.99, na.rm=TRUE)

## Eliminate duplicated modules
ppares.unique <- ppa.unique(nm, ppares)

## How many modules left?
ncol(ppares.unique$rows1)

## Double check
cc2 <- cor(ppares.unique$rows1)
if (interactive()) { hist(cc2[lower.tri(cc2)],10) }

## High correlation?
sum(undiag(cc2) > 0.99, na.rm=TRUE)
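
A correlation matrix is symmetric, so a count taken over the whole off-diagonal part includes every highly correlated pair twice. The base R sketch below, on a made-up matrix (toy.cc is a hypothetical name and its values are invented for illustration), shows one way to count each pair once by looking only at the upper triangle.

```r
## Made-up correlation matrix: modules 1 and 2 are near-duplicates
toy.cc <- matrix(c(1,     0.995, 0.1,
                   0.995, 1,     0.2,
                   0.1,   0.2,   1), nrow = 3, byrow = TRUE)

undiag <- function(x) { diag(x) <- 0; x }

## Whole off-diagonal part: the pair is counted as (1,2) and as (2,1)
both.triangles <- sum(undiag(toy.cc) > 0.99, na.rm = TRUE)

## Upper triangle only: every pair is counted once
pairs.once <- sum(toy.cc[upper.tri(toy.cc)] > 0.99, na.rm = TRUE)
```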
