PermHclust.sigclust: 'SCBiclust' method for identifying hierarchically clustered biclusters

Description

'SCBiclust' method for identifying hierarchically clustered biclusters

Usage

PermHclust.sigclust(
  x = NULL,
  method = c("average", "complete", "single", "centroid"),
  wbound = sqrt(ncol(x)),
  alpha = 0.05,
  dat.perms = 1000,
  dissimilarity = c("squared.distance", "absolute.value"),
  silent = TRUE,
  sigstep = FALSE
)

Value

The function returns an S3 object with the following attributes:

which.x: A list of length num.bicluster; each entry is a logical vector indicating whether each observation is in the given bicluster.
which.y: A list of length num.bicluster; each entry is a logical vector indicating whether each feature is in the given bicluster.
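
As a small illustration of how these logical vectors might be used, the sketch below extracts a bicluster submatrix from a data matrix. The object res is a hand-built stand-in with made-up values, not output from PermHclust.sigclust, and it assumes the elements can be accessed with $ as for an ordinary list.

```r
# Toy data matrix: 4 observations (rows) by 5 features (columns).
x <- matrix(seq_len(20), nrow = 4, ncol = 5)

# Hypothetical stand-in for the returned object (values invented for illustration).
res <- list(
  which.x = list(c(TRUE, TRUE, FALSE, FALSE)),        # observations in bicluster 1
  which.y = list(c(TRUE, FALSE, TRUE, FALSE, FALSE))  # features in bicluster 1
)

rows_in <- res$which.x[[1]]
cols_in <- res$which.y[[1]]

# Logical indexing keeps only the rows and columns flagged TRUE;
# drop = FALSE keeps the result a matrix even if one row or column remains.
bicluster <- x[rows_in, cols_in, drop = FALSE]
print(dim(bicluster))
```

Using which(rows_in) or which(cols_in) converts the logical vectors to row or column indices if those are more convenient.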

Arguments

x: a dataset with n rows and p columns, with observations in rows.
method: method for agglomeration. See documentation in hclust. (default="average")
wbound: the tuning parameter for sparse hierarchical clustering. See documentation in HierarchicalSparseCluster. (default=sqrt(ncol(x)))
alpha: significance level for sigclust test. (default=0.05)
dat.perms: number of \(Beta(\frac{1}{2}, (p-1)/2)\) distributed variables generated for each feature (default=1000)
dissimilarity: How should dissimilarity be calculated? (default is "squared.distance").
silent: should progress be printed? (default=TRUE)
sigstep: Should sigclust be used to assess the strength of identified clusters? (default=FALSE)

Author

Erika S. Helgeson, Qian Liu, Guanhua Chen, Michael R. Kosorok, and Eric Bair

Details

Observations in the bicluster are identified such that they maximize the feature-weighted version of the dissimilarity matrix as implemented in HierarchicalSparseCluster. Features in the bicluster are identified based on their contribution to the clustering of the observations. This algorithm uses a numerical approximation to \(E(\sqrt{B})\), where \(B \sim Beta(\frac{1}{2}, (p-1)/2)\), as the expected null distribution for feature weights.
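
To make the null reference value concrete, the sketch below compares a Monte Carlo estimate of \(E(\sqrt{B})\) with the closed-form value that follows from the Beta moment formula. This is not the package's internal code: the helper names approx_null_weight and exact_null_weight are hypothetical, and n_draws only plays a role analogous to the one described for dat.perms.

```r
# Hypothetical helper: Monte Carlo estimate of E(sqrt(B)),
# where B ~ Beta(1/2, (p - 1)/2) and p is the number of features.
approx_null_weight <- function(p, n_draws = 1000) {
  b <- rbeta(n_draws, shape1 = 0.5, shape2 = (p - 1) / 2)
  mean(sqrt(b))
}

# Closed form from the Beta moment formula:
# E(B^(1/2)) = gamma(p/2) / (sqrt(pi) * gamma((p + 1)/2)),
# computed on the log scale to avoid overflow for large p.
exact_null_weight <- function(p) {
  exp(lgamma(p / 2) - lgamma((p + 1) / 2)) / sqrt(pi)
}

set.seed(1)
p <- 50
mc_estimate <- approx_null_weight(p, n_draws = 1000)
exact_value <- exact_null_weight(p)
print(c(monte_carlo = mc_estimate, closed_form = exact_value))
```

The two values should agree up to Monte Carlo error, which shrinks as n_draws grows.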

Examples

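# Simulate 500 observations and 50 features
test <- matrix(nrow=500, ncol=50)
# Angles: observations 1-300 on the upper half circle, 301-500 on the lower half
theta <- rep(NA, 500)
theta[1:300] <- runif(300, 0, pi)
theta[301:500] <- runif(200, pi, 2*pi)
# Even-numbered columns (2 to 40) carry the sine component
test[1:300,seq(from=2,to=40,by=2)] <- -2+5*sin(theta[1:300])
test[301:500,seq(from=2,to=40,by=2)] <- 5*sin(theta[301:500])
# Odd-numbered columns (1 to 39) carry the cosine component
test[1:300,seq(from=1,to=39,by=2)] <- 5+5*cos(theta[1:300])
test[301:500,seq(from=1,to=39,by=2)] <- 5*cos(theta[301:500])
# Add Gaussian noise to the 40 signal features; features 41-50 are pure noise
test[,1:40] <- test[,1:40] + rnorm(40*500, 0, 0.2)
test[,41:50] <- rnorm(10*500, 0, 1)
# Single-linkage clustering with squared-distance dissimilarity
test.PermBiclust <- PermHclust.sigclust(x=test, method='single', dissimilarity='squared.distance')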
