ArrayExpress (version 1.32.0)

ArrayExpressHTS: ExpressionSet for RNA-Seq experiment submitted in ArrayExpress and ENA

Description

ArrayExpressHTS runs the RNA-Seq pipeline on a transcription profiling experiment available on the ArrayExpress database and produces an ExpressionSet R object. ArrayExpressHTS requires an Internet connection.

Usage

ArrayExpressHTS( accession,
            options = list (
                    stranded           = FALSE,
                    insize             = NULL,
                    insizedev          = NULL,
                    reference          = "genome",
                    aligner            = "tophat",
                    aligner_options    = NULL,
                    count_feature      = "transcript",
                    count_options      = "",
                    count_method       = "cufflinks",
                    filter             = TRUE,
                    filtering_options  = NULL ),
            usercloud = TRUE,
            rcloudoptions = list (
                    nnodes     = "automatic",
                    pool       = c("4G", "8G", "16G", "32G", "64G"),
                    nretries   = 4 ),
            steplist = c("align", "count", "eset"),
            dir = getwd(),
            refdir = getDefaultReferenceDir(),
            want.reports = TRUE, 
            stop.on.warnings = FALSE )

Arguments

accession
an ArrayExpress experiment accession identifier, e.g. "E-GEOD-16190"
options
defines pipeline options. See getDefaultProcessingOptions.
usercloud
defines whether the R Cloud is used to parallelize the experiment computation. If FALSE, experiment data files are processed sequentially.
rcloudoptions
defines R Cloud options. See getDefaultRCloudOptions.
steplist
defines the steps the pipeline will perform.
dir
folder where experiment data will be stored and processed. Default is current directory.
refdir
the directory where reference data is located.
want.reports
defines whether quality reports are produced. Reports usually lengthen the computation and use more memory. For faster computation use FALSE.
stop.on.warnings
if TRUE, the pipeline stops when a warning is raised. Warnings are normally produced when there are inconsistencies that would still allow the result to be produced.
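
As a minimal sketch of how these arguments fit together, the block below copies the default options from the Usage section, overrides two of them with base R's modifyList, and passes the result to a sequential run without reports. The names default_options, my_options and run_pipeline are illustrative and not part of the package. The call itself is guarded so that nothing runs unless the flag is switched on and the function is already available in the session.

```r
# Defaults copied from the Usage section of this page.
default_options <- list(
    stranded          = FALSE,
    insize            = NULL,
    insizedev         = NULL,
    reference         = "genome",
    aligner           = "tophat",
    aligner_options   = NULL,
    count_feature     = "transcript",
    count_options     = "",
    count_method      = "cufflinks",
    filter            = TRUE,
    filtering_options = NULL)

# Override only the entries that differ; all other defaults are kept.
my_options <- modifyList(default_options,
                         list(stranded = TRUE, filter = FALSE))

# Illustrative switch: set to TRUE only where the pipeline tools,
# reference data and an Internet connection are available.
run_pipeline <- FALSE

if (run_pipeline && exists("ArrayExpressHTS", mode = "function")) {
    aehts <- ArrayExpressHTS("E-GEOD-16190",
                             options      = my_options,
                             usercloud    = FALSE,  # process files sequentially
                             steplist     = c("align", "count", "eset"),
                             dir          = tempdir(),
                             want.reports = FALSE)  # skip quality reports
}
```

Passing the complete list rather than only the changed entries avoids relying on how the function treats missing options.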

Value

The output is an object of class ExpressionSet containing expression values in assayData (corresponding to the raw sequencing data files), the information contained in the .sdrf file in phenoData, the information in the .adf file in featureData and the .idf file content in experimentData. If executed on a local PC, make sure that the tools are available to the pipeline. Check prepareAnnotation to see what needs to be done to make tools available.
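
The mapping between ExpressionSet components and their accessors can be illustrated on a toy object. This is only a sketch: the matrix and sample table below are made-up stand-ins, not pipeline output, and the block assumes the Biobase package is installed (it does nothing otherwise).

```r
if (requireNamespace("Biobase", quietly = TRUE)) {
    # Made-up expression values: two features (rows) by two runs (columns).
    counts <- matrix(c(10, 0, 5, 7), nrow = 2,
                     dimnames = list(c("tx1", "tx2"), c("run1", "run2")))

    # Made-up sample annotation; row names must match the matrix columns.
    samples <- data.frame(condition = c("control", "treated"),
                          row.names = c("run1", "run2"))

    toy <- Biobase::ExpressionSet(
        assayData = counts,
        phenoData = Biobase::AnnotatedDataFrame(samples))

    expr_values     <- Biobase::exprs(toy)           # assayData: expression matrix
    sample_info     <- Biobase::pData(toy)           # phenoData: per-sample table
    feature_info    <- Biobase::fData(toy)           # featureData: per-feature table
    experiment_info <- Biobase::experimentData(toy)  # experimentData: description
}
```

The same accessors apply to the object returned by the pipeline, where the components are filled from the experiment files described above.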

See Also

ArrayExpressHTSFastQ, prepareReference, prepareAnnotation, getDefaultProcessingOptions, getPipelineOptions

Examples

if (isRCloud()) { # disabled on local configs
    # so as not to affect package building process

    # if executed on a local PC, make sure tools
    # are available to the pipeline.
    expfolder = tempdir();
    
    # run the pipeline
    #
    aehts = ArrayExpressHTS("E-GEOD-16190", dir = expfolder);
    
    # load the expression set object
    loadednames = load(file.path(expfolder, "E-GEOD-16190",
                        "eset_notstd_rpkm.RData"));
    loadednames;

    # attach Biobase for the ExpressionSet accessors used below
    library(Biobase);

    # print out the expression values
    #
    head(assayData(eset)$exprs);

    # print out the experiment meta data 
    experimentData(eset);
    pData(eset);
}
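
The example above loads the saved objects straight into the current workspace. A hedged alternative, sketched below with base R only, loads an .RData file into its own environment so that existing objects are not overwritten. The helper load_into_env and the stand-in file eset_demo.RData are illustrative names, and the stand-in object is a plain matrix rather than a real ExpressionSet.

```r
# Load every object stored in an .RData file into a fresh environment
# and return them as a named list.
load_into_env <- function(path) {
    env <- new.env()
    loaded <- load(path, envir = env)  # character vector of object names
    mget(loaded, envir = env)
}

# Stand-in file for demonstration; a pipeline run would instead write
# files such as eset_notstd_rpkm.RData under the experiment folder.
demo_file <- file.path(tempdir(), "eset_demo.RData")
eset <- matrix(1:4, nrow = 2)
save(eset, file = demo_file)
rm(eset)

objs <- load_into_env(demo_file)
names(objs)  # names of the objects found in the file
```

Returning a list makes it easy to check which objects a file actually contains before using them.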