isa2 (version 0.3.4)

isa: Iterative Signature Algorithm

Description

Run ISA with the default parameters

Usage

"isa"(data, ...)

Arguments

data
The input. It must be a numeric matrix. It may contain NA and/or NaN values, but then the algorithm might be a bit slower, as R matrix multiplication might be slower for these matrices, depending on your platform.
...
Additional arguments, see details below.

Value

A named list is returned with the following elements:
rows
The row components in the biclusters, a numeric matrix. Every column in it corresponds to a bicluster, if an element (the score of the row) is non-zero, that means that the row is included in the bicluster, otherwise it is not. Scores are between -1 and 1. If the scores of two rows have the same (nonzero) sign, that means that the two corresponding rows “behave” the same way. If they have opposite sign, that means that they behave the opposite way.If the corresponding seed has not converged during the allowed number of iterations, then that column of rows contains NA values.
columns
The column components of the biclusters, in the same format as the rows.If the corresponding seed has not converged during the allowed number of iterations, then that column of columns contains NA values.
seeddata
A data frame containing information about the biclusters. There is one row for each bicluster. The data frame has the following columns:
iterations
The number of iterations needed to converge to the bicluster.
oscillation
The oscillation period for oscillating biclusters. It is zero for non-oscillating ones.
thr.row
The row threshold that was used for find the bicluster.
thr.col
The column threshold that was used for finding the bicluster.
freq
The number of times the bicluster was found.
rob
The robustness score of the bicluster, see robustness for details.
rob.limit
The robustness limit that was used to filter the module. See isa.filter.robust for details.
rundata
A named list with information about the ISA runs. It has the following entries:
direction
Character vector of length two. Specifies which side(s) of the score distribution were kept in each ISA step. See the direction argument of isa.iterate for details.
convergence
Character scalar. The convergence criteria for the iteration. See the convergence argument of isa.iterate for details.
eps
Numeric scalar. The threshold for convergence, if the ‘eps’ convergence criteria was used.
cor.limit
Numeric scalar. The threshold for convergence, if the ‘cor’ convergence criteria was used.
corx
Numeric scalar, the shift in number of iterations, to check convergence. See the convergence argument of isa.iterate for details.
maxiter
Numeric scalar. The maximum number of iterations that were allowed for an input seed.
N
Numeric scalar. The total number of seeds that were used for all the thresholds.
prenormalize
Logical scalar. Whether the data was pre-normalized.
hasNA
Logical scalar. Whether the (normalized) data had NA or NaN values.
unique
Logical scalar. Whether the similar biclusters were merged by calling isa.unique.
oscillation
Logical scalar. Whether the algorithm looked for oscillating modules as well.
rob.perms
Numeric scalar, the number of permutations that were used to calculate the baseline robustness for filtering. See the perms argument of the isa.filter.robust function for details.

Details

Please read the isa2-package manual page for an introduction on ISA.

This function can be called as

    isa(data, thr.row=seq(1,3,by=0.5),
        thr.col=seq(1,3,by=0.5), no.seeds=100,
        direction=c("updown", "updown"))
      
where the arguments are:
data
The input. It must be a numeric matrix. It may contain NA and/or NaN values, but then the algorithm might be a bit slower, as R matrix multiplication might be slower for these matrices, depending on your platform.

thr.row
Numeric vector. The row threshold parameters for which the ISA will be run. We use all possible combinations of thr.row and thr.col.

thr.col
Numeric vector. The column threshold parameters for which the ISA will be run. We use all possible combinations of thr.row and thr.col.

no.seeds
Integer scalar, the number of seeds to use.

direction
Character vector of length two, one for the rows, one for the columns. It specifies whether we are interested in rows/columns that are higher (‘up’) than average, lower than average (‘down’), or both (‘updown’).

The isa function provides an easy to use interface to the ISA. It runs all steps of a typical ISA work flow with their default parameters.

This involves:

  1. Normalizing the data by calling isa.normalize.
  2. Generating random input seeds via generate.seeds.
  3. Running ISA with all combinations of given row and column thresholds, (by default 1, 1.5, 2, 2.5, 3); by calling isa.iterate.
  4. Merging similar modules, separately for each threshold combination, by calling isa.unique.
  5. Filtering the modules separately for each threshold combination, by calling isa.filter.robust.
  6. Putting all modules from the runs with different thresholds into a single object.
  7. Merging similar modules, across all threshold combinations, if two modules are similar, then the larger one, the one with the milder thresholds is kept.

Please see the manual pages of these functions for the details or if you want to change their default parameters.

References

Bergmann S, Ihmels J, Barkai N: Iterative signature algorithm for the analysis of large-scale gene expression data Phys Rev E Stat Nonlin Soft Matter Phys. 2003 Mar;67(3 Pt 1):031902. Epub 2003 Mar 11. Ihmels J, Friedlander G, Bergmann S, Sarig O, Ziv Y, Barkai N: Revealing modular organization in the yeast transcriptional network Nat Genet. 2002 Aug;31(4):370-7. Epub 2002 Jul 22

Ihmels J, Bergmann S, Barkai N: Defining transcription modules using large-scale gene expression data Bioinformatics 2004 Sep 1;20(13):1993-2003. Epub 2004 Mar 25.

See Also

isa2-package for a short introduction on the Iterative Signature Algorithm. See the functions mentioned above if you want to change the default ISA parameters.

Examples

Run this code
## We generate some noisy in-silico data with modules and try to find
## them with the ISA. This might take one or two minutes.
data <- isa.in.silico(noise=0.1)
isa.result <- isa(data[[1]])

## Find the best bicluster for each block in the input
best <- apply(cor(isa.result$rows, data[[2]]), 2, which.max)

## Check correlation
sapply(seq_along(best),
       function(x) cor(isa.result$rows[,best[x]], data[[2]][,x]))

## The same for the columns
sapply(seq_along(best),
       function(x) cor(isa.result$columns[,best[x]], data[[3]][,x]))

## Plot the data and the modules found
if (interactive()) {
  layout(rbind(1:2,3:4))
  image(data[[1]], main="In-silico data")
  sapply(best, function(b) image(outer(isa.result$rows[,b],
                                       isa.result$columns[,b]),
                                 main=paste("Module", b)))  
}

Run the code above in your browser using DataLab