limma (version 3.28.14)

read.idat: Read Illumina expression data directly from IDAT files

Description

Read Illumina BeadArray data from IDAT and manifest (.bgx) files for gene expression platforms.

Usage

read.idat(idatfiles, bgxfile, dateinfo = FALSE, annotation = "Symbol", tolerance = 0L, verbose = TRUE)

Arguments

idatfiles
character vector specifying idat files to be read in.
bgxfile
character string specifying bead manifest file (.bgx) to be read in.
dateinfo
logical. Should date and software version information be read in?
annotation
character vector of annotation columns to be read from the manifest file.
tolerance
integer. The number of probe ID discrepancies allowed between the manifest and any of the IDAT files.
verbose
logical. Should progress messages be sent to standard output?

Value

An EListRaw object with the following components:
E
numeric matrix of raw intensities.
other$NumBeads
numeric matrix of same dimensions as E giving number of beads used for each intensity value.
other$STDEV
numeric matrix of same dimensions as E giving bead-level standard deviation or standard error for each intensity value.
genes
data.frame of probe annotation. This includes the Probe_Id and Array_Address_Id columns extracted from the manifest file, plus a Status column identifying control probes, plus any other columns specified by annotation.
targets
data.frame of sample information. This includes the IDAT file names plus other columns if dateinfo=TRUE.
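
As a rough illustration of how these components fit together, the sketch below builds a plain list with the same component names and queries it. The list is only a stand-in for a real EListRaw object. All values and probe IDs are made up, and the Status labels "regular" and "negative" are assumed here because they match the defaults used by neqc.

```r
# Stand-in for the object returned by read.idat: a plain list with the
# same component names (toy values, 4 probes x 2 arrays)
x <- list(
  E = matrix(c(120, 95, 3000, 110, 130, 90, 2800, 105), nrow = 4),
  other = list(
    NumBeads = matrix(c(20, 18, 25, 22, 19, 21, 24, 20), nrow = 4),
    STDEV    = matrix(c(15, 12, 200, 14, 16, 11, 190, 13), nrow = 4)
  ),
  genes = data.frame(
    Probe_Id = c("ILMN_A", "ILMN_B", "ILMN_C", "ILMN_D"),  # made-up IDs
    Array_Address_Id = c(10001, 10002, 10003, 10004),
    Status = c("regular", "regular", "regular", "negative"),  # assumed labels
    stringsAsFactors = FALSE
  )
)

# E, NumBeads and STDEV share the same dimensions
dims_match <- identical(dim(x$E), dim(x$other$NumBeads)) &&
  identical(dim(x$E), dim(x$other$STDEV))

# Count probes by type and keep the intensities of regular probes only
status_counts <- table(x$genes$Status)
E_regular <- x$E[x$genes$Status == "regular", , drop = FALSE]
```

The same subsetting expressions should carry over to an object returned by read.idat, since EListRaw components are accessed with the same `$` syntax.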

Details

Illumina's BeadScan/iScan software outputs probe intensities in IDAT format (encrypted XML files) and uses probe information stored in a platform-specific manifest file (.bgx). These files can be processed using the low-level functions readIDAT and readBGX from the illuminaio package (Smith et al. 2013).

The read.idat function provides a convenient way to read these files into R and to store them in an EListRaw-class object. The function serves a similar purpose to read.ilmn, which reads text files exported by Illumina's GenomeStudio software, but it reads the IDAT files directly without any need to convert them to text first. The function reads information on control probes as well as on regular probes. Probe types are indicated in the Status column of the genes component of the EListRaw object.

The annotation argument specifies probe annotation columns to be extracted from the manifest file. The manifest typically contains the following columns: "Species", "Source", "Search_Key", "Transcript", "ILMN_Gene", "Source_Reference_ID", "RefSeq_ID", "Unigene_ID", "Entrez_Gene_ID", "GI", "Accession", "Symbol", "Protein_Product", "Probe_Id", "Array_Address_Id", "Probe_Type", "Probe_Start", "Probe_Sequence", "Chromosome", "Probe_Chr_Orientation", "Probe_Coordinates", "Cytoband", "Definition", "Ontology_Component", "Ontology_Process", "Ontology_Function", "Synonyms", "Obsolete_Probe_Id". Note that "Probe_Id" and "Array_Address_Id" are always extracted and do not need to be included in the annotation argument.
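
A minimal sketch of requesting several annotation columns is given below. The variable `wanted` and the choice of columns are illustrative only. The call to read.idat is guarded so that it is attempted only when limma is installed and exactly one manifest plus at least one IDAT file are present in the working directory.

```r
# Columns we would like in x$genes (names taken from the manifest column list)
wanted <- c("Probe_Id", "Array_Address_Id", "Symbol", "Entrez_Gene_ID", "Chromosome")

# Probe_Id and Array_Address_Id are always extracted, so drop them
annotation <- setdiff(wanted, c("Probe_Id", "Array_Address_Id"))

# Only attempt the read when limma and the input files are available
idatfiles <- dir(pattern = "\\.idat$")
bgxfile   <- dir(pattern = "\\.bgx$")
if (requireNamespace("limma", quietly = TRUE) &&
    length(idatfiles) > 0 && length(bgxfile) == 1) {
  x <- limma::read.idat(idatfiles, bgxfile, annotation = annotation)
  head(x$genes)
}
```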

If more than tolerance probes in the manifest cannot be found in an IDAT file, the function stops with an error.
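
The tolerance rule can be pictured with a small base-R sketch. The helper `check_probe_tolerance` and its argument names are hypothetical and written only for illustration. This is not limma's internal code, and the integer IDs are toy values.

```r
# Hypothetical helper: mimic the documented tolerance rule.
# manifest_ids: probe IDs listed in the manifest (.bgx)
# idat_ids:     probe IDs present in one IDAT file
check_probe_tolerance <- function(manifest_ids, idat_ids, tolerance = 0L) {
  n_missing <- sum(!(manifest_ids %in% idat_ids))
  if (n_missing > tolerance) {
    stop(n_missing, " manifest probes not found in IDAT file (tolerance = ",
         tolerance, ")")
  }
  invisible(n_missing)
}

manifest_ids <- c(10001L, 10002L, 10003L, 10004L)
idat_ids     <- c(10001L, 10002L, 10004L)   # one manifest probe is absent

# With the default tolerance of 0, a single missing probe is an error
strict <- tryCatch(check_probe_tolerance(manifest_ids, idat_ids),
                   error = function(e) "error")

# Allowing one discrepancy lets the same file through
relaxed <- check_probe_tolerance(manifest_ids, idat_ids, tolerance = 1L)
```

In practice, a small positive tolerance might be used when the manifest is known to differ from the arrays by a handful of probes. Leaving it at 0 keeps the check strict.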

References

Smith ML, Baggerly KA, Bengtsson H, Ritchie ME, Hansen KD (2013). illuminaio: An open source IDAT parsing tool. F1000 Research 2, 264. http://f1000research.com/articles/2-264/

See Also

read.ilmn imports gene expression data output by GenomeStudio.

neqc performs normexp by control background correction, log transformation and quantile between-array normalization for Illumina expression data.

propexpr estimates the proportion of expressed probes in a microarray. detectionPValues computes detection p-values from the negative controls.

Examples

## Not run:
# library(limma)
# idatfiles <- dir(pattern="idat")
# bgxfile <- dir(pattern="bgx")
# x <- read.idat(idatfiles, bgxfile)
# x$other$Detection <- detectionPValues(x)
# propexpr(x)
# y <- neqc(x)
# ## End(Not run)
