crlmm (version 1.30.0)

crlmmIlluminaV2: Read and Genotype Illumina Infinium II BeadChip data with CRLMM

Description

Implementation of the CRLMM algorithm for data from Illumina's Infinium II BeadChips.

Usage

crlmmIlluminaV2(sampleSheet=NULL, arrayNames=NULL, ids=NULL, path=".", arrayInfoColNames=list(barcode="SentrixBarcode_A", position="SentrixPosition_A"), highDensity=FALSE, sep="_", fileExt=list(green="Grn.idat", red="Red.idat"), saveDate=FALSE, stripNorm=TRUE, useTarget=TRUE, row.names=TRUE, col.names=TRUE, probs=c(1/3, 1/3, 1/3), DF=6, SNRMin=5, gender=NULL, seed=1, mixtureSampleSize=10^5, eps=0.1, verbose=TRUE, cdfName, sns, recallMin=10, recallRegMin=1000, returnParams=FALSE, badSNP=.7)

Arguments

sampleSheet
data.frame containing Illumina sample sheet information (for required columns, refer to BeadStudio Genotyping guide - Appendix A).
arrayNames
character vector containing names of arrays to be read in. If NULL, all arrays that can be found in the specified working directory will be read in.
ids
vector containing ids of probes to be read in. If NULL, all probes found on the first array are read in.
path
character string specifying the location of the files to be read by the function.
arrayInfoColNames
(used when sampleSheet is specified) list containing the elements 'barcode', which names the sampleSheet column holding the array number/barcode, and 'position', which names the column holding the strip number. In older style sample sheets, this information is combined (usually in a column named 'SentrixPosition'), and this should be specified as list(barcode=NULL, position="SentrixPosition").
highDensity
logical (used when sampleSheet is specified). If TRUE, array extensions '_A', '_B' in sampleSheet are replaced with 'R01C01', 'R01C02' etc.
sep
character string specifying separator used in .idat file names.
fileExt
list containing elements 'green' and 'red' which specify the .idat file extension for the Cy3 and Cy5 channels.
saveDate
'logical'. Should the dates from each .idat be saved with sample information?
stripNorm
'logical'. Should the data be strip-level normalized?
useTarget
'logical' (only used when stripNorm=TRUE). Should the reference HapMap intensities be used in strip-level normalization?
row.names
'logical'. Should SNP names be used as row names?
col.names
'logical'. Should sample names be used as column names?
probs
'numeric' vector with priors for AA, AB and BB.
DF
'integer' with number of degrees of freedom to use with t-distribution.
SNRMin
'numeric' scalar defining the minimum SNR used to filter out samples.
gender
'integer' vector, with same length as 'filenames', defining sex. (1 - male; 2 - female)
seed
'integer' scalar for random number generator (used to sample mixtureSampleSize SNPs for the mixture model).
mixtureSampleSize
'integer'. The number of SNPs to be used when fitting the mixture model.
eps
Minimum change for mixture model.
verbose
'logical'.
cdfName
'character' defining the chip annotation (manifest) to use ('human370v1c', 'human550v3b', 'human650v3a', 'human1mv1c', 'human370quadv3c', 'human610quadv1b', 'human660quadv1a', 'human1mduov3b', 'humanomni1quadv1b', 'humanomniexpress12v1b', 'humancytosnp12v2p1h')
sns
'character' vector with sample names to be used.
recallMin
'integer'. Minimum number of samples for recalibration.
recallRegMin
'integer'. Minimum number of SNPs for regression.
returnParams
'logical'. Return recalibrated parameters.
badSNP
'numeric'. Threshold to flag as bad SNP (affects batchQC)
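
The arguments sampleSheet, arrayInfoColNames, sep and fileExt together determine which .idat files are looked up under path. The sketch below illustrates one plausible reading of that scheme, assuming file names of the form barcode, position and extension joined by sep. The sample sheet contents and the variable names arrayNames, grnFiles, redFiles and missingFiles are hypothetical and are not taken from the package source. It may be useful for checking that the expected files exist before calling the function.

```r
# Hypothetical sample sheet; column names match the default arrayInfoColNames.
sampleSheet <- data.frame(
  Sample_ID         = c("sample1", "sample2"),
  SentrixBarcode_A  = c("1234567890", "1234567890"),
  SentrixPosition_A = c("A", "B"),
  stringsAsFactors  = FALSE
)

# Defaults as listed in the Usage section.
arrayInfoColNames <- list(barcode = "SentrixBarcode_A",
                          position = "SentrixPosition_A")
sep     <- "_"
fileExt <- list(green = "Grn.idat", red = "Red.idat")
path    <- "."

# Assumed naming scheme: <barcode><sep><position><sep><extension>.
arrayNames <- paste(sampleSheet[[arrayInfoColNames$barcode]],
                    sampleSheet[[arrayInfoColNames$position]],
                    sep = sep)
grnFiles <- paste(arrayNames, fileExt$green, sep = sep)
redFiles <- paste(arrayNames, fileExt$red, sep = sep)

# List any expected files that are not present under 'path'.
allFiles     <- c(grnFiles, redFiles)
missingFiles <- allFiles[!file.exists(file.path(path, allFiles))]
```

If missingFiles is not empty, the column names given in arrayInfoColNames, the value of sep, or the extensions in fileExt probably do not match the files on disk.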

Value

A SnpSet object which contains
calls
Genotype calls (1 - AA, 2 - AB, 3 - BB)
callProbability
confidence scores 'round(-1000*log2(1-p))'
in the assayData slot and
SNPQC
SNP Quality Scores
batchQC
Batch Quality Scores
along with center and scale parameters when returnParams=TRUE in the featureData slot.
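
The confidence scores are stored as integers computed by 'round(-1000*log2(1-p))', where p is the posterior probability of the call. As a minimal sketch in base R, the transformation and its approximate inverse can be written as below. The helper names p2score and score2p are hypothetical and chosen for this illustration only.

```r
# Probability -> integer confidence score, following the formula given above.
p2score <- function(p) round(-1000 * log2(1 - p))

# Integer confidence score -> probability (inverse up to rounding error).
score2p <- function(score) 1 - 2^(-score / 1000)

p      <- c(0.5, 0.9, 0.99, 0.999)
scores <- p2score(p)       # larger scores mean more confident calls
pBack  <- score2p(scores)  # close to p, but not exact because of rounding
```

A probability of 0.5 maps to a score of 1000, and each additional 1000 points halves the remaining probability of an incorrect call.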

Details

This function combines the reading of data from idat files using readIdatFiles and genotyping to reduce memory usage.

References

Ritchie ME, Carvalho BS, Hetrick KN, Tavaré S, Irizarry RA. R/Bioconductor software for Illumina's Infinium whole-genome genotyping BeadChips. Bioinformatics. 2009 Oct 1;25(19):2621-3.

Carvalho B, Bengtsson H, Speed TP, Irizarry RA. Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data. Biostatistics. 2007 Apr;8(2):485-99. Epub 2006 Dec 22. PMID: 17189563.

Carvalho BS, Louis TA, Irizarry RA. Quantifying uncertainty in genotype calls. Bioinformatics. 2010 Jan 15;26(2):242-9.

See Also

crlmmIllumina

Examples

## crlmmOut = crlmmIlluminaV2(samples,path=path,arrayInfoColNames=list(barcode="Chip",position="Section"),
##                             saveDate=TRUE,cdfName="human370v1c",returnParams=TRUE)