annotateEset: Method to annotate ExpressionSets automatically

Description

This function fills the featureData slot of the ExpressionSet automatically, so that downstream methods can use it to provide annotated output. Annotating results by hand is tedious and can be surprisingly difficult to get right. Annotating the data automatically removes the tedium and adds an extra layer of security, since the resulting ExpressionSet is automatically tested for validity (e.g., that the annotation data match up correctly with the expression data). Current choices for the annotation data are a ChipDb object (e.g., hugene10sttranscriptcluster.db) or an AffyGenePDInfo object (e.g., pd.hugene.1.0.st.v1). In the latter case, the data come from the parsed Affymetrix annotation CSV file. This option is only intended for situations where the ChipDb package is not available.

Usage

annotateEset(object, x, ...)

## S4 method for signature 'ExpressionSet,ChipDb'
annotateEset(object, x,
  columns = c("PROBEID", "ENTREZID", "SYMBOL", "GENENAME"),
  multivals = "first")

## S4 method for signature 'ExpressionSet,AffyGenePDInfo'
annotateEset(object, x, type = "core", ...)

## S4 method for signature 'ExpressionSet,AffyHTAPDInfo'
annotateEset(object, x, type = "core", ...)

## S4 method for signature 'ExpressionSet,AffyExonPDInfo'
annotateEset(object, x, type = "core", ...)

## S4 method for signature 'ExpressionSet,character'
annotateEset(object, x, ...)

## S4 method for signature 'ExpressionSet,data.frame'
annotateEset(object, x, probecol = NULL, annocols = NULL, ...)

Arguments

object

An ExpressionSet to which we want to add annotation.

x

The source of the annotation data: a ChipDb object (e.g., hugene10sttranscriptcluster.db), a pdInfoPackage object (e.g., pd.hugene.1.0.st.v1), or a user-supplied data.frame.

...

Additional arguments passed on to the methods. Particularly useful for passing in the columns, multivals, and type arguments.

columns

For the ChipDb method; which annotation data to add. Use the columns function to see what choices are available. By default we get the PROBEID, ENTREZID, SYMBOL, and GENENAME.

multivals

For the ChipDb method; this is passed to mapIds to control how 1:many mappings are handled. The default is 'first', which takes just the first result. Other valid values are 'list' and 'CharacterList', which return all mapped results.
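
To get a feel for what the columns and multivals arguments control, it can help to query the ChipDb object directly. The following is a minimal sketch, assuming the AnnotationDbi and hugene10sttranscriptcluster.db packages are installed. The variable names (db, ids, first_sym, all_sym) are illustrative only, and the calls go to AnnotationDbi's columns, keys, and mapIds rather than to annotateEset itself.

```r
library(AnnotationDbi)
library(hugene10sttranscriptcluster.db)

db <- hugene10sttranscriptcluster.db

## Annotation types that could be requested via the 'columns' argument
available <- columns(db)

## A handful of probeset IDs to experiment with (arbitrary, illustrative choice)
ids <- head(keys(db, keytype = "PROBEID"), 20)

## "first": one value per probeset, returned as a named character vector
first_sym <- mapIds(db, keys = ids, column = "SYMBOL",
                    keytype = "PROBEID", multiVals = "first")

## "list": every mapped value per probeset, returned as a named list
all_sym <- mapIds(db, keys = ids, column = "SYMBOL",
                  keytype = "PROBEID", multiVals = "list")
```

Probesets without a gene symbol (control probesets, for example) map to NA in both cases. With 'list', a probeset that maps to several genes keeps all of them, which is more complete but produces list-like annotation columns that some downstream tools may not handle.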

type

For pdInfoPackages; either 'core' or 'probeset', corresponding to the 'target' argument used in the call to rma.

probecol

Column of the data.frame that contains the probeset IDs. Can be either numeric (the column number) or character (the column header).

annocols

Column(s) of the data.frame to use for annotating. Can be a numeric vector (the column numbers to use) or a character vector (the column names to use).
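
The data.frame method can be tried out without any CEL files. The following is a minimal sketch, assuming the Biobase and affycoretools packages are installed. The probeset IDs, sample names, gene symbols, and column names (probe_id, symbol, description) are all made up for illustration.

```r
library(Biobase)
library(affycoretools)

## Toy expression matrix: 3 hypothetical probesets x 2 hypothetical samples
mat <- matrix(c(7.1, 8.3, 5.9, 7.4, 8.0, 6.2), nrow = 3,
              dimnames = list(c("ps1", "ps2", "ps3"),
                              c("sampleA", "sampleB")))
eset <- ExpressionSet(assayData = mat)

## Hypothetical annotation table, with rows deliberately in a different
## order from the rows of the expression matrix
anno <- data.frame(probe_id = c("ps3", "ps1", "ps2"),
                   symbol = c("GENE3", "GENE1", "GENE2"),
                   description = c("gene three", "gene one", "gene two"),
                   stringsAsFactors = FALSE)

## probecol names the ID column; annocols names the columns to attach
eset <- annotateEset(eset, anno, probecol = "probe_id",
                     annocols = c("symbol", "description"))

## Inspect the annotation now stored in the featureData slot
fData(eset)
```

Passing column names rather than column numbers for probecol and annocols keeps the call readable and robust to reordering of the data.frame columns.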

Value

An ExpressionSet that has annotation data added to the featureData slot.

Methods (by class)

object = ExpressionSet,x = ChipDb: Annotate an ExpressionSet using a ChipDb package for annotation data.
object = ExpressionSet,x = AffyGenePDInfo: Annotate an ExpressionSet using an AffyGenePDInfo package.
object = ExpressionSet,x = AffyHTAPDInfo: Annotate an ExpressionSet using an AffyHTAPDInfo package.
object = ExpressionSet,x = AffyExonPDInfo: Annotate an ExpressionSet using an AffyExonPDInfo package.
object = ExpressionSet,x = character: Method to capture character input.
object = ExpressionSet,x = data.frame: Annotate an ExpressionSet using a user-supplied data.frame.

Examples

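## load the packages that provide the functions and annotation objects used below
library(oligo)
library(affycoretools)
library(hugene10sttranscriptcluster.db)
library(pd.hugene.1.0.st.v1)
## read the CEL files in the working directory and summarize with RMA
dat <- read.celfiles(filenames = list.celfiles())
eset <- rma(dat)
## annotate using ChipDb
eset <- annotateEset(eset, hugene10sttranscriptcluster.db)
## or AffyGenePDInfo
eset <- annotateEset(eset, pd.hugene.1.0.st.v1)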