dyebias.apply.correction(data.norm, iGSDBs, estimator.subset=TRUE, application.subset=TRUE, dyebias.percentile=5, minmaxA.perc=25, minA.abs=NULL, maxA.abs=NULL, verbose=FALSE)
marrayNorm object containing the data whose dye bias should
be corrected. This object must be a complete marrayNorm object. In particular,
maLabels(maGnames(data.norm)) should be set and indicate the
identities of the spots. Spots with the same ID should contain the same
oligo or cDNA sequence, and will receive the same dye bias correction.
dyebias.estimate.iGSDBs, but this is
not necessary; other estimates can also be used.
The data frame must have (at least) the following columns:
maLabels(maGnames(data.norm))
minmaxA.perc,
minA.abs, maxA.abs arguments are still applied).
The order of the rows in this data frame is irrelevant. There must
be no rows with duplicate reporterId in this frame.
For any reporter in data.norm that is not in the
iGSDBs data frame, an iGSDB of 0.00 is used, i.e. data from
such reporters is not dye bias-corrected.
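The constraints on the iGSDBs data frame can be checked before calling the function. The following is a minimal sketch in base R and not part of the dyebias package: the helper name check_igsdbs and the toy values are hypothetical, and the column name dyebias is an assumption (only reporterId and the A column are named in this text).

```r
# Hypothetical helper (not part of dyebias): validate an iGSDBs data frame
# and look up the iGSDB for every spot label, using 0 for unknown reporters.
# The column name "dyebias" is an assumption made for this sketch.
check_igsdbs <- function(iGSDBs, spot.labels,
                         required = c("reporterId", "dyebias", "A")) {
  missing.cols <- setdiff(required, colnames(iGSDBs))
  if (length(missing.cols) > 0) {
    stop("missing column(s): ", paste(missing.cols, collapse = ", "))
  }
  if (anyDuplicated(iGSDBs$reporterId) > 0) {
    stop("duplicate reporterId values found")
  }
  # match() returns NA for labels that are absent from the data frame
  idx <- match(spot.labels, iGSDBs$reporterId)
  per.spot <- iGSDBs$dyebias[idx]
  per.spot[is.na(idx)] <- 0   # reporters without an estimate stay uncorrected
  per.spot
}

# Toy data: row order differs from spot order, and "g4" has no estimate.
toy.iGSDBs <- data.frame(reporterId = c("g3", "g1", "g2"),
                         dyebias    = c(-0.4, 0.8, 0.1),
                         A          = c(9.2, 10.5, 7.7),
                         stringsAsFactors = FALSE)
toy.labels <- c("g1", "g2", "g3", "g4", "g1")
per.spot.iGSDB <- check_igsdbs(toy.iGSDBs, toy.labels)
```

Both spots labelled g1 receive the same value, and the spot labelled g4 receives 0, mirroring the behaviour described above.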
An index indicating which reporters are fit to be used as estimators of the slide bias. This set of reporters is used throughout the whole data set. Reporters that are typically excluded are those corresponding to parasitic DNA elements or mitochondrial genes.
maM(data.norm). In the former case, the selected spots on all
slides will be dye bias-corrected; in the latter, selected spots on
selected slides will be corrected. Often it is prudent not to dye bias-correct measurements that are
close to the detection limit or close to signal saturation. A
convenience function for this is provided; see
dyebias.application.subset.
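The difference between the two forms of application.subset can be illustrated on a toy matrix. This is a sketch in base R with made-up numbers; it assumes that the first form is a logical index with one element per spot and the second a logical matrix with the dimensions of maM(data.norm), and it only shows which cells each form selects.

```r
# Toy M matrix: 4 spots (rows) x 3 slides (columns); values are made up.
M <- matrix(c(0.5, -0.2, 1.1, 0.0,
              0.4, -0.1, 0.9, 0.2,
              0.6, -0.3, 1.0, 0.1), nrow = 4, ncol = 3)

# Form 1 (assumed): logical vector, one element per spot.
# Expanding it column-wise selects the same spots on every slide.
subset.vec <- c(TRUE, FALSE, TRUE, TRUE)
subset.from.vec <- matrix(subset.vec, nrow = nrow(M), ncol = ncol(M))

# Form 2: logical matrix with the same dimensions as M, here additionally
# excluding spot 3 on slide 2 (say, because it is saturated there).
subset.mat <- subset.from.vec
subset.mat[3, 2] <- FALSE

n.selected.vec <- sum(subset.from.vec)  # cells selected by the vector form
n.selected.mat <- sum(subset.mat)       # cells selected by the matrix form
```

The matrix form is the more flexible one, since a spot can be excluded on one slide and still be corrected on the others.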
To obtain a robust estimate of the slide bias, the range of the
average expression $A$ is trimmed by minmaxA.perc percent
on both sides; only reporters lying inside this trimmed range are
considered as estimators of the slide bias. The default value is 25,
meaning that only probes with an average expression within the
interquartile range are considered as estimator genes (from these,
the top dyebias.percentile red- and green-biased are then
actually used). The default value should suffice in practically all
cases.
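The selection described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's internal code: the helper pick_estimators is hypothetical, the data are simulated, and a positive iGSDB is assumed to mean red-biased.

```r
# Hypothetical helper (not the package's internal code): pick slide bias
# estimators from per-reporter iGSDB values and average expression A.
pick_estimators <- function(dyebias, A, minmaxA.perc = 25,
                            dyebias.percentile = 5) {
  # Keep reporters whose A lies inside the trimmed range.
  A.bounds <- quantile(A, probs = c(minmaxA.perc, 100 - minmaxA.perc) / 100)
  in.range <- A >= A.bounds[1] & A <= A.bounds[2]
  # Among those, take the most green-biased and most red-biased tails
  # (assumption: negative iGSDB = green-biased, positive = red-biased).
  cutoffs <- quantile(dyebias[in.range],
                      probs = c(dyebias.percentile,
                                100 - dyebias.percentile) / 100)
  list(green = which(in.range & dyebias <= cutoffs[1]),
       red   = which(in.range & dyebias >= cutoffs[2]))
}

set.seed(1)
toy.A <- runif(1000, min = 6, max = 14)   # simulated average expression
toy.dyebias <- rnorm(1000, sd = 0.3)      # simulated iGSDBs
est <- pick_estimators(toy.dyebias, toy.A)
```

With the defaults, roughly half of the reporters pass the filter on $A$, and about 5 percent of those end up in each of the two estimator sets.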
If specified, reporters with an average expression
($A$) lower than this value are never considered as estimators
of the slide bias. If not specified, reporters with an
$A$-percentile < minmaxA.perc are not considered.
100-minmaxA.perc are not considered.
marrayNorm object of the same 'shape' as
the input data.norm, but with corrected $M$ values.
estimators list are:
application.subset are considered.
var.ratio, but expressed as a percentage.
The larger this value, the greater the correction.
marrayNorm, for convenience
This function corrects the gene-specific dye bias of two-colour microarrays with the GASSCO method. This method is general, robust and fast, and is based on the observation that the total bias per gene is the product of a slide-specific factor (strongly related to the labeling percentage) and an intrinsic gene-specific factor (iGSDB), which is strongly related to the probe sequence.
The slide bias is estimated from the total bias of the
dyebias.percentile percentage of reporters having the strongest
iGSDB. The iGSDBs can be estimated with
dyebias.estimate.iGSDBs.
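The multiplicative model described above can be illustrated with simulated data. The following base R sketch is not the package's implementation: it assumes that the total bias of reporter i on slide j equals the slide factor times the iGSDB, estimates each slide factor by a least-squares fit through the origin over the reporters with the strongest iGSDBs, and subtracts the fitted bias. The estimator that dyebias.apply.correction actually uses may differ, and all names and numbers here are made up.

```r
set.seed(42)
n.genes <- 500
n.slides <- 4
iGSDB <- rnorm(n.genes, sd = 0.3)          # simulated gene-specific factor
slide.factor <- c(1.5, 0.5, -1.0, 2.0)     # simulated slide-specific factors
noise <- matrix(rnorm(n.genes * n.slides, sd = 0.05), n.genes, n.slides)

# Total bias is the outer product of gene factor and slide factor.
M <- outer(iGSDB, slide.factor) + noise

# Estimators: reporters with the 5% strongest iGSDBs at either end.
cutoffs <- quantile(iGSDB, probs = c(0.05, 0.95))
estimators <- which(iGSDB <= cutoffs[1] | iGSDB >= cutoffs[2])

# Per slide, least-squares slope through the origin of M on iGSDB
# (an assumed estimator, chosen here for simplicity).
g <- iGSDB[estimators]
F.hat <- apply(M[estimators, , drop = FALSE], 2,
               function(m) sum(m * g) / sum(g^2))

# Subtract the estimated bias (slide factor times iGSDB) from every spot.
M.corrected <- M - outer(iGSDB, F.hat)
```

Because reporters with a strong iGSDB carry most of the information about the slide factor, restricting the fit to them keeps the estimate stable even when most reporters have an iGSDB near zero.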
If the signal of certain oligos is too weak or, in contrast, tends to
be saturated, they are not good estimators of the slide bias.
Therefore, only reporters with an average expression level $A$
that is not too extreme are allowed to be slide bias estimators. (This
is the reason for the A-column in the iGSDBs data
frame).
Full control over which reporters to allow as slide bias estimators is
given by the arguments minmaxA.perc, minA.abs, and
maxA.abs; see there for details. To not exclude any reporter
(e.g., when $A$ is not available and therefore artificially set),
you can use minA.abs= -Inf and maxA.abs = Inf.
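How the percentile and absolute limits interact can be sketched in base R. The helper estimator_A_range below is hypothetical and only mirrors the rules stated in the argument descriptions (an absolute limit, when given, replaces the corresponding percentile limit); it is not the package's internal code, and the values of A are made up.

```r
# Hypothetical helper: resolve the allowed range of A for estimators.
# An absolute limit, when supplied, replaces the percentile-based one.
estimator_A_range <- function(A, minmaxA.perc = 25,
                              minA.abs = NULL, maxA.abs = NULL) {
  perc <- quantile(A, probs = c(minmaxA.perc, 100 - minmaxA.perc) / 100,
                   names = FALSE)
  lower <- if (is.null(minA.abs)) perc[1] else minA.abs
  upper <- if (is.null(maxA.abs)) perc[2] else maxA.abs
  c(lower = lower, upper = upper)
}

toy.A <- c(6.1, 7.4, 8.0, 9.3, 10.2, 11.8, 12.5, 13.9)

# Default: only the interquartile range of A is allowed.
default.range <- estimator_A_range(toy.A)
n.default <- sum(toy.A >= default.range["lower"] &
                 toy.A <= default.range["upper"])

# Infinite absolute limits: no reporter is excluded on the basis of A.
open.range <- estimator_A_range(toy.A, minA.abs = -Inf, maxA.abs = Inf)
n.open <- sum(toy.A >= open.range["lower"] & toy.A <= open.range["upper"])
```

In this toy example the default keeps the four middle values, whereas the infinite limits keep all eight.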
For further details concerning the method, see the dyebias
vignette and the publication. If your research benefits from using this
package, we kindly request that you cite this work.
dyebias.estimate.iGSDBs,
dyebias.application.subset,
dyebias.rgplot,
dyebias.maplot,
dyebias.boxplot,
dyebias.trendplot
## First load data and estimate the iGSDBs
## (see dyebias.estimate.iGSDBs)
### choose the estimators and which spots to correct:
estimator.subset <- dyebias.umcu.proper.estimators(maInfo(maGnames(data.norm)))
### choose which genes to dye bias correct:
application.subset <- (maW(data.norm) == 1 &
                       dyebias.application.subset(data.raw=data.raw, use.background=TRUE))
### do the correction:
correction <- dyebias.apply.correction(data.norm=data.norm,
                                       iGSDBs = iGSDBs.estimated,
                                       estimator.subset=estimator.subset,
                                       application.subset = application.subset,
                                       verbose=FALSE)
## Not run:
# edit(correction$summary)
## End(Not run)
## give overview:
correction$summary[,c("slide", "file", "avg.correction", "reduction.perc", "p.value")]
## and summary:
summary(as.numeric(correction$summary[, "reduction.perc"]))