normalize.qspline: Normalize arrays

Description

normalizes arrays in an AffyBatch each other or to a set of target intensities

Usage

normalize.AffyBatch.qspline(abatch,type=c("together", "pmonly", "mmonly", "separate"), ...)
normalize.qspline(x, target = NULL, samples = NULL,  fit.iters = 5, min.offset = 5,  spline.method = "natural", smooth = TRUE, spar = 0, p.min = 0, p.max = 1.0,  incl.ends = TRUE, converge = FALSE,  verbose = TRUE, na.rm = FALSE)

Arguments

a data.matrix of intensities

abatch

an AffyBatch

target

numerical vector of intensity values to normalize to. (could be the name for one of the celfiles in 'abatch').

samples

numerical, the number of quantiles to be used for spline. if (0,1], then it is a sampling rate.

fit.iters

number of spline interpolations to average.

min.offset

minimum span between quantiles (rank difference) for the different fit iterations.

spline.method

specifies the type of spline to be used. Possible values are `"fmm"', `"natural"', and `"periodic"'.

smooth

logical, if `TRUE', smoothing splines are used on the quantiles.

spar

smoothing parameter for `splinefun', typically in (0,1].

p.min

minimum percentile for the first quantile.

p.max

maximum percentile for the last quantile.

incl.ends

include the minimum and maximum values from the normalized and target arrays in the fit.

converge

(currently unimplemented)

verbose

logical, if `TRUE' then normalization progress is reported.

na.rm

logical, if `TRUE' then handle NA values (by ignoring them).

type

a string specifying how the normalization should be applied. See details for more.

...

optional parameters to be passed through.

Value

a normalized AffyBatch.

Details

This normalization method uses the quantiles from each array and the target to fit a system of cubic splines to normalize the data. The target should be the mean (geometric) or median of each probe but could also be the name of a particular chip in the abatch object.

Parameters setting can be of much importance when using this method. The parameter fit.iter is used as a starting point to find a more appropriate value. Unfortunately the algorithm used do not converge in some cases. If this happens, the fit.iter value is used and a warning is thrown. Use of different settings for the parameter samples was reported to give good results. More specifically, for about 200 data points use samples = 0.33, for about 2000 data points use samples = 0.05, for about 10000 data points use samples = 0.02 (thanks to Paul Boutros). The type argument should be one of "separate","pmonly","mmonly","together" which indicates whether to normalize only one probe type (PM,MM) or both together or separately.

References

Christopher Workman, Lars Juhl Jensen, Hanne Jarmer, Randy Berka, Laurent Gautier, Henrik Bjorn Nielsen, Hans-Henrik Saxild, Claus Nielsen, Soren Brunak, and Steen Knudsen. A new non-linear normal- ization method for reducing variability in dna microarray experiments. Genome Biology, accepted, 2002