preprocessInf:
Preprocessing of Illumina Infinium II arrays.

Description

This function normalizes the intensities for the 'A' and 'B' alleles of a CNSet object and estimates the mixture parameters used for subsequent genotyping. See Details for how the normalized intensities are written to file. This step must be run before genotyping and copy number estimation.

Usage

preprocessInf(cnSet, sampleSheet = NULL, arrayNames = NULL, ids = NULL,
              path = ".",
              arrayInfoColNames = list(barcode = "SentrixBarcode_A",
                                       position = "SentrixPosition_A"),
              highDensity = TRUE, sep = "_",
              fileExt = list(green = "Grn.idat", red = "Red.idat"),
              XY, saveDate = TRUE, stripNorm = TRUE, useTarget = TRUE,
              mixtureSampleSize = 10^5, fitMixture = TRUE,
              quantile.method = "between", eps = 0.1, verbose = TRUE,
              seed = 1, cdfName)

Arguments

cnSet

object of class CNSet

sampleSheet

data.frame containing Illumina sample sheet information (for required columns, refer to BeadStudio Genotyping guide - Appendix A).

arrayNames

character vector containing names of arrays to be read in. If NULL, all arrays that can be found in the specified working directory will be read in.

ids

vector containing ids of probes to be read in. If NULL, all probes found on the first array are read in.

path

character string specifying the location of the files to be read by the function.

arrayInfoColNames

(used when sampleSheet is specified) list containing the elements 'barcode' and 'position'. 'barcode' names the column of sampleSheet that holds the array number (barcode), and 'position' names the column that holds the strip number. In older style sample sheets, this information is combined (usually in a column named 'SentrixPosition'), and the argument should be specified as list(barcode=NULL, position="SentrixPosition").

highDensity

logical (used when sampleSheet is specified). If TRUE, array extensions '_A', '_B' in sampleSheet are replaced with 'R01C01', 'R01C02' etc.

sep

character string specifying separator used in .idat file names.

fileExt

list containing elements 'green' and 'red' which specify the .idat file extension for the Cy3 and Cy5 channels.
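The arguments arrayInfoColNames, sep and fileExt together determine which .idat files are looked up. The base R sketch below illustrates one plausible way the file names are assembled from a sample sheet. The helper make_array_names and the two-row sample sheet are hypothetical and only illustrate the naming convention; this is an assumption about the convention, not the package's internal code.

```r
## Hypothetical two-row sample sheet; the column names match the defaults
## of 'arrayInfoColNames'.
sampleSheet <- data.frame(
  Sample_ID         = c("s1", "s2"),
  SentrixBarcode_A  = c("4019585367", "4019585367"),
  SentrixPosition_A = c("R01C01", "R01C02"),
  stringsAsFactors  = FALSE
)

## Hypothetical helper (not part of the package): join barcode and position
## with 'sep', or use the position column alone when barcode is NULL
## (older style sample sheets).
make_array_names <- function(sheet, cols, sep = "_") {
  position <- as.character(sheet[[cols$position]])
  if (is.null(cols$barcode)) {
    return(position)
  }
  paste(sheet[[cols$barcode]], position, sep = sep)
}

arrayInfoColNames <- list(barcode = "SentrixBarcode_A",
                          position = "SentrixPosition_A")
sep <- "_"
fileExt <- list(green = "Grn.idat", red = "Red.idat")

arrayNames <- make_array_names(sampleSheet, arrayInfoColNames, sep = sep)

## One green (Cy3) and one red (Cy5) file is expected per array
grnFiles <- paste(arrayNames, fileExt$green, sep = sep)
redFiles <- paste(arrayNames, fileExt$red, sep = sep)
```

With an older style sample sheet, passing list(barcode = NULL, position = "SentrixPosition") to the helper would use the combined column unchanged.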

XY

an NChannelSet object containing X and Y intensities.

saveDate

'logical'. Should the dates from each .idat be saved with sample information?

stripNorm

'logical'. Should the data be strip-level normalized?

useTarget

'logical' (only used when stripNorm=TRUE). Should the reference HapMap intensities be used in strip-level normalization?

mixtureSampleSize

Sample size to be used when fitting the mixture model.

fitMixture

'logical'. Whether to fit the per-array mixture model.

quantile.method

character string specifying the quantile normalization method to use ('within' or 'between' channels).

eps

Stopping criterion.

verbose

'logical'. Whether to print descriptive messages during processing.

seed

Seed to be used when sampling. Useful for reproducibility.

cdfName

character string indicating which annotation package to load.

Value

An ff_matrix object containing parameters for fitting the mixture model. Note that while the CNSet object is not returned by this function, the object is updated as the normalized intensities are written to disk. In particular, after applying this function, the normalized intensities are available in the alleleA and alleleB elements of assayData.

Details

The normalized intensities are written to disk using the ff package's protocols for reading from and writing to disk. Note that the CNSet object, which contains the ff objects in its assayData slot, is updated after applying this function.
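The following is a minimal sketch of how this function might be called so that the side effect on the CNSet object becomes visible. It is modeled loosely on the workflow of the 'illumina_copynumber' vignette. The wrapper preprocess_sketch and its arguments samplesheet, datadir, batch and outdir are hypothetical, and the sketch assumes that the sample sheet uses the default barcode and position columns and that the .idat files are in datadir.

```r
## A sketch only: 'samplesheet', 'datadir', 'batch', and 'outdir' are
## hypothetical inputs supplied by the caller. Nothing is run until the
## function is called with real data.
preprocess_sketch <- function(samplesheet, datadir, batch, cdfName,
                              outdir = tempdir()) {
  library(ff)     # attach ff so that assay data are stored as ff objects
  library(crlmm)
  ldPath(outdir)  # directory where the ff files are written

  arrayInfo <- list(barcode = "SentrixBarcode_A",
                    position = "SentrixPosition_A")
  ## Assumption: array names are barcode and position joined by "_"
  arrayNames <- file.path(datadir,
                          paste(samplesheet[[arrayInfo$barcode]],
                                samplesheet[[arrayInfo$position]],
                                sep = "_"))

  ## Container whose assayData elements are ff objects on disk
  cnSet <- constructInf(sampleSheet = samplesheet, arrayNames = arrayNames,
                        batch = batch, arrayInfoColNames = arrayInfo,
                        cdfName = cdfName, verbose = TRUE, saveDate = TRUE)

  ## Returns the mixture parameters; the normalized intensities are written
  ## to the ff files that cnSet points to
  mixtureParams <- preprocessInf(cnSet = cnSet, sampleSheet = samplesheet,
                                 arrayNames = arrayNames,
                                 arrayInfoColNames = arrayInfo,
                                 cdfName = cdfName)

  list(cnSet = cnSet, mixtureParams = mixtureParams)
}
```

Because the assay data live on disk, the cnSet element of the returned list refers to the same files that were updated during preprocessing; the normalized intensities could then be inspected with accessors such as A() and B().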

Examples

	## See the 'illumina_copynumber' vignette in inst/scripts of
	## the source package