ArrayExpressHTS (version 1.32.0)

prepareAnnotation: Prepare annotation data for the RNA-Seq Pipeline

Description

prepareAnnotation downloads the required annotation file for the selected organism from Ensembl and processes it so that it can be used by the pipeline. prepareAnnotation requires an Internet connection.

Usage

prepareAnnotation(organism, version = "current", 
        location = getDefaultReferenceDir(), refresh = FALSE, run = TRUE)

Arguments

organism
name of the organism, for example "Homo_sapiens". Supported organism names can be viewed in the Ensembl database; check 'ftp://ftp.ensembl.org/pub'.
version
"current" or another appropriate version string, for example "NCBIM37.61". Check 'ftp://ftp.ensembl.org/pub'.
location
directory where the annotation data should be stored. Defaults to getDefaultReferenceDir().
refresh
if TRUE, existing annotation data will be rebuilt.
run
if FALSE, the commands to obtain and process the annotation will not be executed.
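As an illustration of how these arguments combine, the following is a minimal sketch of a dry run, not a definitive recipe. It assumes the ArrayExpressHTS package is installed; the variable name reference_dir and the guard around the call are choices made for this sketch only. The call is skipped in non-interactive sessions and when the package is unavailable.

```r
# Sketch only: assumes the ArrayExpressHTS package is installed.
# reference_dir is a hypothetical name chosen for this example.
reference_dir <- file.path(tempdir(), "reference")
dir.create(reference_dir, recursive = TRUE, showWarnings = FALSE)

# Only attempt the call in an interactive session where the package exists.
if (interactive() && requireNamespace("ArrayExpressHTS", quietly = TRUE)) {
    # run = FALSE: according to the argument description, the commands to
    # obtain and process the annotation are not executed.
    # refresh = FALSE: existing annotation data is not rebuilt.
    ArrayExpressHTS::prepareAnnotation("Homo_sapiens",
                                       version = "current",
                                       location = reference_dir,
                                       refresh = FALSE,
                                       run = FALSE)
}
```

Setting refresh = TRUE in the same call would request a rebuild of existing annotation data in that location.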

Value

  • The version of the organism annotation that has been downloaded and processed. The annotation files are kept in the folder specified by the location argument.
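The return value can be captured and the contents of the location folder inspected afterwards. The sketch below is hedged: it assumes ArrayExpressHTS is installed and that an Internet connection is available, and the names reference_dir, annotation_version and annotation_files are hypothetical. The names of the files written by the pipeline are not specified here, so the code simply lists whatever is present.

```r
# Sketch only: assumes ArrayExpressHTS is installed and Internet access works.
# All variable names below are hypothetical and chosen for this example.
reference_dir <- file.path(tempdir(), "reference")
dir.create(reference_dir, recursive = TRUE, showWarnings = FALSE)

annotation_version <- NULL
if (interactive() && requireNamespace("ArrayExpressHTS", quietly = TRUE)) {
    # This may take a long time, since annotation is downloaded and processed.
    annotation_version <- ArrayExpressHTS::prepareAnnotation(
        "Homo_sapiens", "current", location = reference_dir)
    print(annotation_version)
}

# List whatever files now exist under the chosen location
# (an empty character vector if the call above was skipped).
annotation_files <- list.files(reference_dir, recursive = TRUE)
print(annotation_files)
```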

See Also

ArrayExpressHTS, ArrayExpressHTSFastQ, prepareReference

Examples

if (isRCloud()) { # disabled on local configs 
    # so as not to affect package building process
    
    par(ask = FALSE)

    # the following piece of code will take ~1.5 hours to complete
    #
    
    # if executed on a local PC, make sure tools are available
    # to the pipeline.
    # 
    
    # create directory
    #
    # Please note: tempdir() is used for automatic test
    # execution. Select a directory more appropriate and
    # suitable for keeping reference data.
    #
    referencefolder = file.path(tempdir(), "reference")
    
    dir.create(referencefolder)
    
    # download and prepare annotation
    prepareAnnotation("Homo_sapiens", "current", location = referencefolder)
    prepareAnnotation("Mus_musculus", "NCBIM37.61", location = referencefolder)

}
