duplicateCorrelation(object, design=NULL, ndups=2, spacing=1, block=NULL,
trim=0.15, weights=NULL)
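object: a numeric matrix of expression values, or any data object from which as.matrix will extract a suitable matrix, such as an MAList, marrayNorm or ExpressionSet object. If object is an MAList object then the arguments design, ndups, spacing and weights will be extracted from it if available and do not have to be specified as arguments. Specifying these arguments explicitly will over-rule any components found in the data object.

design: the design matrix of the microarray experiment, with rows corresponding to arrays and columns to comparisons to be estimated. The number of rows must match the number of columns of object. Defaults to the unit vector meaning that the arrays are treated as replicates.

ndups: a positive integer giving the number of times each gene is printed on an array. nrow(object) must be divisible by ndups. Will be ignored if block is specified.

spacing: the spacing between the rows of object corresponding to duplicate spots, spacing=1 for consecutive spots.

block: vector or factor specifying a blocking variable.

trim: the fraction of observations to be trimmed from each end of tanh(all.correlations) when computing the trimmed mean.

weights: an optional numeric matrix of the same dimension as object containing weights for each spot. If smaller than object then it will be filled out to the same size.

The function returns a list with components:

consensus.correlation: the average estimated inter-duplicate correlation. The average is the trimmed mean of the individual correlations on the atanh-transformed scale.

cor: same as consensus.correlation, for compatibility with earlier versions of the software.

atanh.correlations: numeric vector of length nrow(object)/ndups giving the individual genewise atanh-transformed correlations.

When block=NULL, this function estimates the correlation between duplicate spots (regularly spaced within-array replicate spots).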
If block is not null, this function estimates the correlation between repeated observations on the blocking variable.
Typically the blocks are biological replicates and the repeated observations are technical replicates.
In either case, the correlation is estimated by fitting a mixed linear model by REML individually for each gene.
The function also returns a consensus correlation, a robust average of the individual correlations, which can be used as input for the functions lmFit or gls.series.
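The consensus value can be sketched in base R as a trimmed mean taken on the atanh scale and then transformed back with tanh. The genewise correlations below are simulated purely for illustration, and trim = 0.15 mirrors the default shown in the usage; this is a sketch of the idea under those assumptions, not limma's internal code.

```r
# Hypothetical genewise correlations, simulated for illustration only
set.seed(1)
rho <- runif(200, min = -0.2, max = 0.9)
rho[1:3] <- NA   # stand-ins for genes where no estimate could be computed

# Work on the atanh scale, the scale of the returned atanh.correlations
atanh.correlations <- atanh(rho)

# Trimmed mean on the transformed scale, then back-transform with tanh
consensus <- tanh(mean(atanh.correlations, trim = 0.15, na.rm = TRUE))
consensus
```

Averaging on the atanh scale and trimming both tails keeps a handful of extreme or missing genewise estimates from dominating the result.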
At this time it is not possible to estimate correlations between duplicate spots and between technical replicates simultaneously.
If block is not null, then the function will set ndups=1, which is equivalent to ignoring duplicate spots.
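When a blocking variable is supplied, a call might look like the following sketch. The data are simulated, and the layout (four biological replicates, each measured twice, in two groups) is an assumption made only for illustration; limma and statmod need to be installed.

```r
library(limma)   # duplicateCorrelation also relies on the statmod package

set.seed(42)
ngenes <- 100

# Hypothetical layout: 4 biological replicates (blocks), each measured twice
block <- rep(1:4, each = 2)
group <- factor(rep(c("A", "B"), each = 4))
design <- model.matrix(~ group)

# Simulated log-expression values with a shared effect within each block
y <- matrix(rnorm(ngenes * 8), nrow = ngenes, ncol = 8)
block.effect <- matrix(rnorm(ngenes * 4), nrow = ngenes, ncol = 4)
y <- y + block.effect[, block]

# Estimate the correlation between repeated observations on each block
corfit <- duplicateCorrelation(y, design, block = block)
corfit$consensus.correlation

# Pass the same blocking variable and the consensus correlation to lmFit
fit <- lmFit(y, design, block = block,
             correlation = corfit$consensus.correlation)
```

Here ndups is left at its default, since it is ignored once block is specified.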
For this function to return statistically useful results, there must be at least two more arrays than the number of coefficients to be estimated, i.e., two more than the column rank of design.
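The requirement on the number of arrays can be checked before calling the function. The helper below, has_enough_arrays, is a hypothetical name introduced here for illustration and is not part of limma; it compares the number of columns of the data with the column rank of the design matrix using base R only.

```r
# Hypothetical helper (not part of limma): TRUE if there are at least two
# more arrays than the column rank of the design matrix
has_enough_arrays <- function(object, design) {
  narrays <- ncol(as.matrix(object))
  ncoef <- qr(design)$rank
  narrays >= ncoef + 2
}

# Illustrative design with an intercept and one treatment effect
design <- cbind(Intercept = 1, Treatment = c(0, 0, 1, 1))
y <- matrix(rnorm(40), nrow = 10, ncol = 4)

has_enough_arrays(y, design)                 # 4 arrays, rank 2
has_enough_arrays(y[, 1:3], design[1:3, ])   # 3 arrays, rank 2
```

With four arrays and a rank-2 design the condition is just met; dropping one array leaves too few.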
The function may take a long time to execute as it fits a mixed linear model for each gene using an iterative algorithm.
It is not uncommon for the function to return a small number of warning messages that correlation estimates cannot be computed for some individual genes.
This is not a serious concern providing that there are only a few such warnings and the total number of genes is large.
The consensus estimator computed by this function will not be materially affected by a small number of genes.

This function uses mixedModel2Fit from the statmod package.
An overview of linear model functions in limma is given by 06.LinearModels.

# Also see lmFit examples
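# MA is assumed to be an MAList and design a design matrix, both defined beforehand
corfit <- duplicateCorrelation(MA, design=design, ndups=2)
# Back-transform the genewise estimates from the atanh scale to correlations
all.correlations <- tanh(corfit$atanh.correlations)
boxplot(all.correlations)
# Pass the consensus correlation on to the linear model fit
fit <- lmFit(MA, design, ndups=2, correlation=corfit$consensus)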