loadGPR: Importing raw data from gpr files.

Description

Constructs an EListRaw object from a set of gpr files containing ProtoArray data or other protein microarray data.

Usage

loadGPR(gpr.path = NULL, targets.path = NULL, array.type = NULL,
  aggregation = "none",
  array.columns = list(E = "F635 Median", Eb = "B635 Median"),
  array.annotation = c("Block", "Column", "Row", "Description", "Name", "ID"),
  description = NULL, description.features = NULL, description.discard = NULL)

Arguments

gpr.path

string indicating the path to a folder containing gpr files (mandatory).

targets.path

string indicating the path to the targets file (see limma; mandatory).

array.type

string indicating the microarray type of the imported gpr files. The possible options are "ProtoArray", "HuProt" and "other". Duplicate aggregation is performed only for ProtoArrays (mandatory).

aggregation

string indicating which type of ProtoArray spot duplicate aggregation should be performed. If "min" is chosen, the value for the corresponding feature will be the minimum of both duplicate values. If "mean" is chosen, the arithmetic mean will be computed. If "none" is chosen, no aggregation will be performed. The default shown under Usage is "none" (optional).

array.columns

list containing the column names for foreground intensities (E) and background intensities (Eb) in the gpr files. The list is passed to limma's read.maimages() function (optional).

array.annotation

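string vector containing the names of the annotation columns that are passed to limma. The argument itself is optional, but the default columns "Block", "Column", "Row", "Description", "Name" and "ID" must be included.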

description

string indicating the name of an alternative column that identifies each spot as a feature, a control or a spot to be discarded. Use it for gpr files that do not provide the column "Description" (optional).

description.features

string containing a regular expression identifying feature spots. Mandatory when description has been defined.

description.discard

string containing a regular expression identifying spots to be discarded (e.g., empty spots). Mandatory when description has been defined.

Value

An extended object of class EListRaw (see the documentation of limma for details) is returned. If array.type is set to "ProtoArray", the object provides additional components for control spot data: C, Cb and cgenes, which are analogous to the probe spot data E, Eb and genes. Moreover, the returned object always provides the additional component array.type indicating the type of the imported protein microarray data (e.g., "ProtoArray").
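The components described above can be inspected on the returned object. The following is a minimal sketch that reuses the dummy files from the Examples section; it assumes PAA is installed and is skipped otherwise. The vector expected_components is a helper introduced only for this illustration.

```r
# Component names as listed in the Value section for array.type = "ProtoArray".
expected_components <- c("E", "Eb", "genes", "C", "Cb", "cgenes", "array.type")

# Run only when PAA is available (assumption: the dummy files ship in extdata).
if (requireNamespace("PAA", quietly = TRUE)) {
  gpr <- system.file("extdata", package = "PAA")
  targets <- list.files(gpr, pattern = "dummy_targets", full.names = TRUE)
  elist <- PAA::loadGPR(gpr.path = gpr, targets.path = targets,
                        array.type = "ProtoArray")

  # Report which of the documented components are present.
  print(expected_components %in% names(elist))

  # Probe spot and control spot intensities are stored separately.
  print(dim(elist$E))
  print(dim(elist$C))
  print(elist$array.type)
}
```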

Details

This function is partially a wrapper to limma's function read.maimages() featuring optional duplicate aggregation for ProtoArray data. Paths to a targets file and to a folder containing gpr files are mandatory. All gpr files in that folder that are listed in the targets file will be read. The folder "R_HOME/library/PAA/extdata" contains an example targets file that can be used as a template. If array.type (also mandatory) is set to "ProtoArray", duplicate spots can be aggregated. The corresponding method ("min", "mean" or "none") can be specified via the argument aggregation. As another ProtoArray-specific feature, control spot data and information will be stored in additional components of the returned object (see Value). The arguments array.columns and array.annotation define the columns where read.maimages() will find foreground and background intensity values as well as other important columns. For array.annotation, the default columns "Block", "Column", "Row", "Description", "Name" and "ID" are mandatory.
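The difference between the aggregation methods can be illustrated with base R. This is only a sketch on made-up intensities, not PAA's internal implementation: the data frame dup, its column names and the way duplicates are paired by ID are assumptions made for illustration.

```r
# Hypothetical foreground intensities: three features, each spotted in duplicate.
dup <- data.frame(
  ID = c("P1", "P1", "P2", "P2", "P3", "P3"),
  E  = c(1200, 1000, 450, 550, 80, 120)
)

# "min": keep the smaller of the two duplicate values per feature.
agg_min <- tapply(dup$E, dup$ID, min)

# "mean": arithmetic mean of the two duplicate values per feature.
agg_mean <- tapply(dup$E, dup$ID, mean)

# "none": the values stay as they are, one row per spot.
agg_none <- dup$E

print(agg_min)
print(agg_mean)
```

Taking the minimum is the more conservative choice, because a single bright duplicate cannot raise the aggregated value on its own.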

If the gpr files for ProtoArrays do not provide the column "Description", a makeshift column will be constructed automatically from the column "Name". For other microarrays, the arguments description, description.features and description.discard can be used to provide the mandatory information (see the example below).
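How the two regular expressions partition spots can be sketched with grepl() in base R. The spot names below are invented, and treating every spot that matches neither pattern as a control is an assumption of this sketch rather than documented behaviour of loadGPR().

```r
# Hypothetical entries of the column named by the argument description.
spot_names <- c("Hs~BC000001", "Hs~NM_000002", "Empty", "ControlSpot", "Empty")

# Patterns as they would be passed to description.features / description.discard.
features_pattern <- "^Hs~"
discard_pattern  <- "Empty"

is_feature <- grepl(features_pattern, spot_names)
is_discard <- grepl(discard_pattern, spot_names)

# Assumption: spots matching neither pattern are treated as controls.
spot_class <- ifelse(is_discard, "discard",
                     ifelse(is_feature, "feature", "control"))

print(data.frame(Name = spot_names, Class = spot_class))
```

Checking the patterns against the actual "Name" column in this way before calling loadGPR() helps catch expressions that match too many or too few spots.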

References

The package limma by Gordon Smyth et al. can be downloaded from Bioconductor (http://www.bioconductor.org/).

Smyth, G. K. (2005). Limma: linear models for microarray data. In: Bioinformatics and Computational Biology Solutions using R and Bioconductor, R. Gentleman, V. Carey, S. Dudoit, R. Irizarry, W. Huber (eds.), Springer, New York, pages 397-420.

Examples

gpr <- system.file("extdata", package="PAA") 
targets <- list.files(system.file("extdata", package="PAA"),
 pattern = "dummy_targets", full.names=TRUE)   
elist <- loadGPR(gpr.path=gpr, targets.path=targets, array.type="ProtoArray")

# Example showing how to use the arguments description, description.features and
# description.discard to construct a makeshift column 'Description'
# for gpr files without this column. See also the example gpr files
# shipped with PAA.
targets2 <- list.files(system.file("extdata", package="PAA"),
 pattern = "dummy_no_descr_targets", full.names=TRUE)
elist2 <- loadGPR(gpr.path=gpr, targets.path=targets2, array.type="other",
 description="Name", description.features="^Hs~", description.discard="Empty")
