PermBiclust.beta.ks: 'SCBiclust' method for identifying means-based biclusters with Kolmogorov-Smirnov test of feature weights

Description

'SCBiclust' method for identifying means-based biclusters with Kolmogorov-Smirnov test of feature weights

Usage

PermBiclust.beta.ks(
  x,
  nperms = 1000,
  silent = TRUE,
  maxnum.bicluster = 5,
  ks.alpha = 0.05
)

Value

The function returns an S3 object with the following elements:

num.bicluster: The number of biclusters estimated by the procedure.
x.residual: The data matrix x after removing the signals.
which.x: A list of length num.bicluster with each list entry containing a logical vector denoting if the data observation is in the given bicluster.
which.y: A list of length num.bicluster with each list entry containing a logical vector denoting if the data feature is in the given bicluster.
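
As a rough illustration of how the logical vectors described above might be used, the sketch below builds a small stand-in list with the documented fields and extracts the rows and columns of the first bicluster. The object `res`, its values, and the toy matrix `x` are invented for illustration; they are not output from PermBiclust.beta.ks.

```r
# Toy data: 4 observations (rows) by 5 features (columns).
x <- matrix(1:20, nrow = 4, ncol = 5)

# Hypothetical stand-in mimicking the documented fields of the result.
# A real result would come from PermBiclust.beta.ks(x).
res <- list(
  num.bicluster = 1,
  which.x = list(c(TRUE, TRUE, FALSE, FALSE)),        # one flag per observation
  which.y = list(c(FALSE, TRUE, TRUE, FALSE, FALSE))  # one flag per feature
)

k <- 1                            # index of the bicluster to inspect
rows <- which(res$which.x[[k]])   # observation indices in bicluster k
cols <- which(res$which.y[[k]])   # feature indices in bicluster k
sub <- x[rows, cols, drop = FALSE]  # submatrix spanned by bicluster k
```

Using `drop = FALSE` keeps `sub` a matrix even when a bicluster contains a single row or column.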

Arguments

x: a dataset with n rows and p columns, with observations in rows.
nperms: number of \(Beta(\frac{1}{2}, (p-1)/2)\) distributed variables generated for each feature (default=1000)
silent: logical; if TRUE, progress output is suppressed (default=TRUE)
maxnum.bicluster: The maximum number of biclusters returned (default=5)
ks.alpha: significance level for the Kolmogorov-Smirnov test (default=0.05)

Author

Erika S. Helgeson, Qian Liu, Guanhua Chen, Michael R. Kosorok, and Eric Bair

Details

Observations in the bicluster are identified such that they maximize the feature-weighted square root of the between-cluster sum of squares. Features in the bicluster are identified based on their contribution to the clustering of the observations. Feature weights are generated in a similar fashion to KMeansSparseCluster, except with a modified objective function and no sparsity constraint.
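
To make the objective described above more concrete, here is a minimal sketch for a fixed two-group split of the observations. It assumes the feature weights are non-negative with unit L2 norm (an assumption, not confirmed by this page), in which case the weights maximizing the weighted sum of square-rooted between-cluster sums of squares are proportional to those square roots. The helper `bcss_per_feature` and the toy data are hypothetical and are not part of the package.

```r
set.seed(1)

# Toy data: 30 observations, 10 features; the first 10 observations
# are mean-shifted in the first 3 features.
x <- matrix(rnorm(30 * 10), nrow = 30, ncol = 10)
x[1:10, 1:3] <- x[1:10, 1:3] + 3

# A fixed candidate split: TRUE = observation is in the bicluster.
in_cluster <- rep(c(TRUE, FALSE), c(10, 20))

# Hypothetical helper: per-feature between-cluster sum of squares,
# computed as total SS minus within-group SS.
bcss_per_feature <- function(x, in_cluster) {
  ss <- function(m) colSums(scale(m, center = TRUE, scale = FALSE)^2)
  total_ss <- ss(x)
  within_ss <- ss(x[in_cluster, , drop = FALSE]) +
    ss(x[!in_cluster, , drop = FALSE])
  total_ss - within_ss
}

# Guard against tiny negative values from floating-point rounding.
bcss <- pmax(bcss_per_feature(x, in_cluster), 0)

# Under the unit-L2-norm assumption, the maximizing weights are
# proportional to sqrt(bcss) (Cauchy-Schwarz).
w <- sqrt(bcss) / sqrt(sum(bcss))

# Feature-weighted square root of the between-cluster sum of squares.
objective <- sum(w * sqrt(bcss))
```

With these weights the objective reduces to `sqrt(sum(bcss))`, so features with a large between-cluster sum of squares receive most of the weight.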

This algorithm uses a numerical approximation to \(E(\sqrt{B})\) where \(B \sim Beta(\frac{1}{2}, (p-1)/2)\) as the expected null distribution for feature weights. The Kolmogorov-Smirnov test is used to assess if feature weights deviate from the expected null distribution.
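
The null reference described above can be illustrated with a short simulation. The sketch below is not the package's implementation: it estimates \(E(\sqrt{B})\) by Monte Carlo, compares it with the closed-form Beta moment, and then applies base R's `ks.test` to a set of simulated null weights. The variable names, the choice of p, and the way the null weights are generated (a normalized Gaussian vector, whose squared entries are marginally Beta(1/2, (p-1)/2) distributed) are assumptions made for illustration.

```r
set.seed(1)

p <- 200        # number of features (illustrative)
nperms <- 1000  # number of Beta draws (matches the documented default)
a <- 1 / 2
b <- (p - 1) / 2

# Monte Carlo approximation to E(sqrt(B)), B ~ Beta(1/2, (p-1)/2).
mc_mean <- mean(sqrt(rbeta(nperms, a, b)))

# Closed form for comparison:
# E(B^(1/2)) = Gamma(p/2) / (Gamma(1/2) * Gamma((p+1)/2)).
exact_mean <- exp(lgamma(p / 2) - lgamma(1 / 2) - lgamma((p + 1) / 2))

# Simulated null weights: absolute entries of a unit-norm Gaussian vector.
z <- rnorm(p)
w_null <- abs(z) / sqrt(sum(z^2))

# CDF of sqrt(B): P(sqrt(B) <= q) = P(B <= q^2).
sqrt_beta_cdf <- function(q) pbeta(q^2, a, b)

# One-sample Kolmogorov-Smirnov test of the weights against that CDF.
ks_null <- ks.test(w_null, sqrt_beta_cdf)
ks_null$p.value
```

A p-value below `ks.alpha` would be read as evidence that the weights deviate from the null reference. Note that the entries of `w_null` are mildly dependent because they share a normalizing constant, so this check is only approximate.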

Examples


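library(SCBiclust)
set.seed(1)  # arbitrary seed, for reproducibility
# 100 observations by 200 features of standard normal noise
test <- matrix(rnorm(100*200), nrow=100, ncol=200)
# Add two mean-shifted blocks of signal
test[1:20,1:20] <- test[1:20,1:20]+rnorm(20*20, 2)
test[16:30,51:80] <- test[16:30,51:80]+rnorm(15*30, 3)
PermBiclust.beta.ks(test, silent=TRUE)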
