crlmm (version 1.30.0)

preprocessInf:

Preprocessing of Illumina Infinium II arrays.

Description

This function normalizes the intensities for the 'A' and 'B' alleles for a CNSet object and estimates mixture parameters used for subsequent genotyping. See details for how the normalized intensities are written to file. This step is required for subsequent genotyping and copy number estimation.

Usage

preprocessInf(cnSet, sampleSheet=NULL, arrayNames = NULL, ids = NULL, path = ".", arrayInfoColNames = list(barcode = "SentrixBarcode_A", position = "SentrixPosition_A"), highDensity = TRUE, sep = "_", fileExt = list(green = "Grn.idat", red = "Red.idat"), XY, saveDate = TRUE, stripNorm = TRUE, useTarget = TRUE, mixtureSampleSize = 10^5, fitMixture = TRUE, quantile.method="between", eps = 0.1, verbose = TRUE, seed = 1, cdfName)

Arguments

cnSet
object of class CNSet
sampleSheet
data.frame containing Illumina sample sheet information (for required columns, refer to BeadStudio Genotyping guide - Appendix A).
arrayNames
character vector containing names of arrays to be read in. If NULL, all arrays that can be found in the specified working directory will be read in.
ids
vector containing ids of probes to be read in. If NULL, all probes found on the first array are read in.
path
character string specifying the location of files to be read by the function
arrayInfoColNames
(used when sampleSheet is specified) list containing the elements 'barcode', which names the sampleSheet column holding the arrayNumber/barcode number, and 'position', which names the column holding the strip number. In older style sample sheets, this information is combined (usually in a column named 'SentrixPosition'), and this should be specified as list(barcode=NULL, position="SentrixPosition").
highDensity
logical (used when sampleSheet is specified). If TRUE, array extensions '_A', '_B' in sampleSheet are replaced with 'R01C01', 'R01C02', etc.
sep
character string specifying separator used in .idat file names.
fileExt
list containing elements 'green' and 'red' which specify the .idat file extension for the Cy3 and Cy5 channels.
XY
an NChannelSet object containing X and Y intensities.
saveDate
'logical'. Should the dates from each .idat be saved with sample information?
stripNorm
'logical'. Should the data be strip-level normalized?
useTarget
'logical' (only used when stripNorm=TRUE). Should the reference HapMap intensities be used in strip-level normalization?
mixtureSampleSize
Sample size to be used when fitting the mixture model.
fitMixture
'logical'. Whether to fit per-array mixture model.
quantile.method
character string specifying the quantile normalization method to use ('within' or 'between' channels).
eps
Stopping criterion.
verbose
'logical'. Whether to print descriptive messages during processing.
seed
Seed to be used when sampling. Useful for reproducibility.
cdfName
character string indicating which annotation package to load.
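
To make the interplay of sampleSheet, arrayInfoColNames, sep and fileExt concrete, the sketch below builds the .idat file names one would expect for a two-array sample sheet. It uses base R only and illustrates the naming convention implied by the defaults above; it is not crlmm's internal code, and the sample ids, barcode and position values are invented.

```r
## Hypothetical two-row sample sheet; all values are invented for illustration.
sampleSheet <- data.frame(
  Sample_ID         = c("sample1", "sample2"),
  SentrixBarcode_A  = c("1234567890", "1234567890"),
  SentrixPosition_A = c("R01C01", "R01C02"),
  stringsAsFactors  = FALSE
)

## Defaults taken from the Usage section.
arrayInfoColNames <- list(barcode = "SentrixBarcode_A",
                          position = "SentrixPosition_A")
sep <- "_"
fileExt <- list(green = "Grn.idat", red = "Red.idat")

## Assumed convention: array name = barcode and position joined by 'sep'.
arrayNames <- paste(sampleSheet[[arrayInfoColNames$barcode]],
                    sampleSheet[[arrayInfoColNames$position]],
                    sep = sep)

## One file per channel: array name and extension joined by 'sep'.
idatFiles <- data.frame(
  green = paste(arrayNames, fileExt$green, sep = sep),
  red   = paste(arrayNames, fileExt$red, sep = sep),
  stringsAsFactors = FALSE
)
print(idatFiles)
```

Checking file.exists(file.path(path, idatFiles$green)) before calling preprocessInf is a cheap way to catch a mismatched sep or fileExt.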

Value

An ff_matrix object containing parameters for fitting the mixture model. Note that although the CNSet object is not returned by this function, it is updated as the normalized intensities are written to disk. In particular, after this function has been applied, the normalized intensities are available in the alleleA and alleleB elements of assayData.

Details

The normalized intensities are written to disk using the ff package's facilities for reading and writing on-disk data. Note that the CNSet object, which holds the ff objects in its assayData slot, is updated after this function is applied.

See Also

CNSet-class, A, B, constructInf, genotypeInf, annotationPackages

Examples

	## See the 'illumina_copynumber' vignette in inst/scripts of
	## the source package
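
For orientation, here is a minimal sketch of where preprocessInf sits in a workflow. The helper name runPreprocess, the single-batch assignment, and the arguments passed to constructInf are assumptions made for illustration (constructInf is only listed under See Also here, so check its help page for the exact signature); the vignette mentioned above remains the authoritative example. Defining the function does not require crlmm or any data; calling it does.

```r
## Sketch only: construct a CNSet container, then preprocess it.
## 'runPreprocess' is a hypothetical helper, not part of crlmm.
## 'sampleSheet', 'idatPath' and 'cdfName' must be supplied by the user.
runPreprocess <- function(sampleSheet, idatPath, cdfName) {
  if (!requireNamespace("crlmm", quietly = TRUE)) {
    stop("package 'crlmm' is required")
  }
  arrayInfo <- list(barcode = "SentrixBarcode_A",
                    position = "SentrixPosition_A")

  ## Assumption: all arrays are treated as one batch.
  batch <- rep("batch1", nrow(sampleSheet))

  ## Assumption: constructInf accepts these arguments; see ?constructInf.
  cnSet <- crlmm::constructInf(sampleSheet = sampleSheet,
                               path = idatPath,
                               arrayInfoColNames = arrayInfo,
                               cdfName = cdfName,
                               batch = batch)

  ## Arguments as documented in the Usage section of this page.
  mixtureParams <- crlmm::preprocessInf(cnSet = cnSet,
                                        sampleSheet = sampleSheet,
                                        path = idatPath,
                                        arrayInfoColNames = arrayInfo,
                                        cdfName = cdfName,
                                        seed = 1)

  ## cnSet is not returned by preprocessInf; it is handed back here
  ## alongside the mixture parameters so both can be used for genotyping.
  list(cnSet = cnSet, mixtureParams = mixtureParams)
}
```

Returning both objects keeps the mixture parameters next to the CNSet they were estimated from, which is what the subsequent genotyping step needs.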