Usage
blockwiseConsensusModules(
multiExpr, blocks = NULL,
maxBlockSize = 5000,
randomSeed = 12345,
corType = "pearson",
power = 6,
consensusQuantile = 0,
networkType = "unsigned",
TOMType = "unsigned",
TOMDenom = "min",
scaleTOMs = TRUE, scaleQuantile = 0.95,
sampleForScaling = TRUE, sampleForScalingFactor = 1000,
useDiskCache = TRUE, chunkSize = NULL,
cacheBase = ".blockConsModsCache",
deepSplit = 2,
detectCutHeight = 0.995, minModuleSize = 20,
checkMinModuleSize = TRUE,
maxCoreScatter = NULL, minGap = NULL,
maxAbsCoreScatter = NULL, minAbsGap = NULL,
pamStage = TRUE, pamRespectsDendro = TRUE,
minKMEtoJoin = 0.7,
minCoreKME = 0.5, minCoreKMESize = minModuleSize/3,
minKMEtoStay = 0.2,
reassignThresholdPS = 1e-4,
mergeCutHeight = 0.15,
impute = TRUE,
getTOMs = NULL,
saveTOMs = FALSE,
saveTOMFileBase = "consensusTOM",
getTOMScalingSamples = FALSE,
trapErrors = FALSE,
checkPower = TRUE,
numericLabels = FALSE,
checkMissingData = TRUE,
maxPOutliers = 1,
quickCor = 0,
pearsonFallback = "individual",
cosineCorrelation = FALSE,
nThreads = 0,
verbose = 2, indent = 0)
Arguments
multiExpr
expression data in the multi-set format (see checkSets). A vector of
lists, one per set. Each set must contain a component data
that contains the expression data, with
rows corresponding to
blocks
optional specification of blocks in which hierarchical clustering and module detection
should be performed. If given, must be a numeric vector with one entry per gene
of multiExpr giving the number of the block to which the corresponding ge
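The multi-set format and a matching blocks vector can be sketched in base R. This is a minimal illustration on made-up data, assuming the usual layout of samples in rows and genes in columns; the set names, gene names, and the helper makeSet are hypothetical and not part of the package.

```r
set.seed(1)
nGenes <- 6

# Hypothetical helper: build one set as a list with a `data` component.
makeSet <- function(nSamples) {
  m <- matrix(rnorm(nSamples * nGenes), nrow = nSamples, ncol = nGenes)
  colnames(m) <- paste0("gene", seq_len(nGenes))
  list(data = m)
}

# Two sets with different numbers of samples but the same genes.
multiExpr <- list(setA = makeSet(10), setB = makeSet(8))

# One block number per gene: genes 1-3 in block 1, genes 4-6 in block 2.
blocks <- c(1, 1, 1, 2, 2, 2)
```

The only structural requirements shown here are the ones the descriptions above state: one list per set, each with a data component, and one blocks entry per gene.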
maxBlockSize
integer giving maximum block size for module detection. Ignored if blocks
above is non-NULL. Otherwise, if the number of genes in multiExpr
exceeds maxBlockSize, genes
will be pre-clustered into blocks whose size sho
randomSeed
integer to be used as seed for the random number generator before the function
starts. If a current seed exists, it is saved and restored upon exit. If NULL
is given, the function will not save and restore the seed.
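The save-and-restore behaviour described for randomSeed can be illustrated in base R. The helper withLocalSeed below is hypothetical and only mimics the described behaviour; it is not the package's own code.

```r
# Hypothetical helper, not part of WGCNA: evaluate `code` under a fixed seed
# and restore the caller's RNG state on exit. With seed = NULL nothing is
# saved or restored.
withLocalSeed <- function(seed, code) {
  if (!is.null(seed)) {
    if (exists(".Random.seed", envir = globalenv(), inherits = FALSE)) {
      savedSeed <- get(".Random.seed", envir = globalenv())
      on.exit(assign(".Random.seed", savedSeed, envir = globalenv()), add = TRUE)
    }
    set.seed(seed)
  }
  force(code)  # `code` is evaluated lazily, i.e. after set.seed()
}

set.seed(1)
inside <- withLocalSeed(12345, runif(3))  # drawn under seed 12345
after <- runif(1)                         # continues the caller's own stream
```

Because the caller's state is restored, after is the same value that runif(1) would give directly after set.seed(1).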
corType
character string specifying the correlation to be used. Allowed values are (unique
abbreviations of) "pearson" and "bicor", corresponding to Pearson and biweight
midcorrelation, respectively. Missing values are handled using t
power
soft-thresholding power for network construction.
consensusQuantile
quantile at which consensus is to be defined. See details.
networkType
network type. Allowed values are (unique abbreviations of) "unsigned",
"signed", "signed hybrid". See adjacency.
TOMType
one of "none", "unsigned", "signed". If "none", adjacency
will be used for clustering. If "unsigned", the standard TOM will be used (more generally, TOM
function will receive the adjacency
TOMDenom
a character string specifying the TOM variant to be used. Recognized values are
"min" giving the standard TOM described in Zhang and Horvath (2005), and "mean" in which
the min function in the denominator is repl
scaleTOMs
should set-specific TOM matrices be scaled to the same scale?
scaleQuantile
if scaleTOMs is TRUE, topological overlaps (or adjacencies if
TOMs are not computed) will be scaled such that their scaleQuantile
quantiles will agree.
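One way to make a chosen quantile agree between two sets is a power transformation of the similarities. The sketch below uses made-up values in (0, 1); treating the scaling as a power transformation is an assumption made for illustration, and the variable names are hypothetical.

```r
set.seed(1)

# Made-up similarity values standing in for two set-specific TOMs.
simRef <- runif(1000)^2
simSet <- runif(1000)^3
scaleQuantile <- 0.95

# type = 1 returns an observed value, so a monotone transform maps the
# quantile of the input onto the quantile of the output.
qRef <- quantile(simRef, probs = scaleQuantile, type = 1, names = FALSE)
qSet <- quantile(simSet, probs = scaleQuantile, type = 1, names = FALSE)

# Choose the exponent so that qSet^scalePower equals qRef.
scalePower <- log(qRef) / log(qSet)
simSetScaled <- simSet^scalePower

qScaled <- quantile(simSetScaled, probs = scaleQuantile, type = 1, names = FALSE)
```

After the transformation, qScaled matches qRef up to floating-point error, while the ordering of the values in simSet is unchanged.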
sampleForScaling
if TRUE, scale quantiles will be determined from a sample of network
similarities. Note that using all data can double the memory footprint of the function and the function
may fail.
sampleForScalingFactor
determines the number of samples for scaling: the number is
1/scaleQuantile * sampleForScalingFactor. Should be set well above 1 to ensure accuracy of the
sampled quantile.
useDiskCache
should calculated network similarities in individual sets be temporarily saved
to disk? Saving to disk is somewhat slower than keeping all data in memory, but for large blocks and/or
many sets the memory footprint may be too big.
chunkSize
network similarities are saved in smaller chunks of size chunkSize.
cacheBase
character string containing the desired name for the cache files. The actual file
names will consist of cacheBase and a suffix to make the file names unique.
deepSplit
integer value between 0 and 4. Provides a simplified control over how sensitive
module detection should be to module splitting, with 0 least and 4 most sensitive. See
cutreeDynamic for more details.
detectCutHeight
dendrogram cut height for module detection. See
cutreeDynamic for more details.
minModuleSize
minimum module size for module detection. See
cutreeDynamic for more details.
checkMinModuleSize
logical: should sanity checks be performed on minModuleSize?
maxCoreScatter
maximum scatter of the core for a branch to be a cluster, given as the fraction
of cutHeight relative to the 5th percentile of joining heights. See
cutreeDynamic for more details.
minGap
minimum cluster gap given as the fraction of the difference between cutHeight and
the 5th percentile of joining heights. See cutreeDynamic for more details.
maxAbsCoreScatter
maximum scatter of the core for a branch to be a cluster given as absolute
heights. If given, overrides maxCoreScatter. See cutreeDynamic for more details.
minAbsGap
minimum cluster gap given as absolute height difference. If given, overrides
minGap. See cutreeDynamic for more details.
pamStage
logical. If TRUE, the second (PAM-like) stage of module detection will be performed.
See cutreeDynamic for more details.
pamRespectsDendro
logical, only used when pamStage is TRUE.
If TRUE, the PAM stage will
respect the dendrogram in the sense that an object can be PAM-assigned only to clusters that lie below it on
the branch that the object is merged in
minKMEtoJoin
a number between 0 and 1. Genes with eigengene connectivity higher than
minKMEtoJoin
are automatically assigned to their closest module.
minCoreKME
a number between 0 and 1. If a detected module does not have at least
minCoreKMESize genes with eigengene connectivity at least minCoreKME, the module is
disbanded (its genes are unlabeled and returned to the pool of genes wa
minCoreKMESize
see minCoreKME above.
minKMEtoStay
genes whose eigengene connectivity to their module eigengene is lower than
minKMEtoStay
are removed from the module.
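The eigengene connectivity thresholds above can be illustrated with a small base R sketch. It takes the module eigengene to be the first singular vector of the scaled expression data and eigengene connectivity (kME) to be the Pearson correlation of each gene with that eigengene; the data and variable names are made up, and the package's own calculation (see moduleEigengenes) may differ in details.

```r
set.seed(1)
nSamples <- 30
signal <- rnorm(nSamples)

# Four genes that follow a shared signal, plus one unrelated gene.
expr <- cbind(
  sapply(1:4, function(i) signal + rnorm(nSamples, sd = 0.1)),
  rnorm(nSamples)
)
colnames(expr) <- paste0("gene", 1:5)

# Eigengene: first left singular vector of the column-scaled data.
scaledExpr <- scale(expr)
eigengene <- svd(scaledExpr)$u[, 1]

# Align the sign with the average scaled expression so kME is not flipped.
if (cor(eigengene, rowMeans(scaledExpr)) < 0) eigengene <- -eigengene

# kME: correlation of each gene with the eigengene.
kME <- drop(cor(expr, eigengene))

# Genes with kME lower than minKMEtoStay would be removed from the module.
minKMEtoStay <- 0.2
stays <- kME >= minKMEtoStay
```

The four signal-driven genes have kME close to 1 and stay in the module; whether the unrelated gene stays depends on its chance correlation with the eigengene.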
reassignThresholdPS
per-set p-value ratio threshold for reassigning genes between modules.
See Details.
mergeCutHeight
dendrogram cut height for module merging.
impute
logical: should imputation be used for module eigengene calculation? See
moduleEigengenes for more details.
getTOMs
deprecated, please use saveTOMs below.
saveTOMs
logical: should the consensus topological overlap matrices for each block be saved
and returned?
saveTOMFileBase
character string containing the file name base for files containing the
consensus topological overlaps. The full file names have "block.1.RData", "block.2.RData"
etc. appended. These files are standard R data files and can be l
getTOMScalingSamples
logical: should samples used for TOM scaling be saved for future analysis?
This option is only available when sampleForScaling is TRUE.
trapErrors
logical: should errors in calculations be trapped?
checkPower
logical: should basic sanity checks be performed on the supplied power? If
you would like to experiment with unusual powers, set the argument to FALSE
and proceed with caution.
numericLabels
logical: should the returned modules be labeled by colors (FALSE), or by
numbers (TRUE)?
checkMissingData
logical: should data be checked for excessive numbers of missing entries in
genes and samples, and for genes with zero variance? See details.
maxPOutliers
only used for corType=="bicor". Specifies the maximum percentile of data
that can be considered outliers on either
side of the median separately. For each side of the median, if
higher percentile than maxPOutliers is conside
quickCor
real number between 0 and 1 that controls the handling of missing data in the
calculation of correlations. See details.
pearsonFallback
specifies whether the bicor calculation, if used, should revert to Pearson when
median absolute deviation (mad) is zero. Recognized values are (abbreviations of)
"none", "individual", "all". If set to
"none", zero mad will re
cosineCorrelation
logical: should the cosine version of the correlation calculation be used? The
cosine calculation differs from the standard one in that it does not subtract the mean.
nThreads
non-negative integer specifying the number of parallel threads to be used by certain
parts of correlation calculations. This option only has an effect on systems on which a POSIX thread
library is available (which currently includes Linux and Mac OSX, b
verbose
integer level of verbosity. Zero means silent, higher values make the output
progressively more and more verbose.
indent
indentation for diagnostic messages. Zero means no indentation, each unit adds
two spaces.