metagene (version 1.0.0)

parseFeatures: Parse an experiment using a list of features

Description

This function produces the list object that contains all the information necessary to produce a metagene-like plot with the plotGraphic function. Currently supported species are: “mouse”, “human” (default).

Usage

parseFeatures(bamFiles, features=NULL, specie="human", maxDistance=5000, design=NULL, cores=1, debug=FALSE)

Arguments

bamFiles
A vector of BAM files to plot. All BAM files must exist.
features
Either a filename or a vector of filenames. Supported features: ensembl_gene_id. If the value is NULL, all known RefSeq genes will be used.
specie
The species to use. "human": Homo sapiens (default). "mouse": Mus musculus.
maxDistance
The distance around each feature to include in the plot. It must be a positive integer.
design
A data.frame describing the relationship between multiple samples, with one row per sample and one column per group of samples. For example, biological replicates and their corresponding controls belong to the same group. Within a group, 1 marks treatment file(s) and 2 marks control file(s).
cores
Number of cores for parallel processing. Requires the parallel package. It must be a positive integer.
debug
Keep the intermediate files (can use a lot of memory). TRUE or FALSE.

Value

parseFeatures returns a list that contains the data necessary to produce a plot. The data structure is a list of lists. The first level contains the following fields:
  • design: The information from the design file.
  • param: The values of the argument used with parseFeatures.
  • bamFilesDescription: A data.frame with the following columns:
    • bam: The names of the sorted bam files
    • oldBam: The names of the original bam files
    • alignedCount: The number of aligned reads
  • matrix: A list of matrices that will be used to produce the plot, with one element per combination of features/design groups.
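
To make the field list above concrete, here is a small sketch that builds a mock list with the same top-level fields and shows how each one might be accessed. The object is constructed by hand, not returned by parseFeatures; the file names, the read count, the element name inside matrix and the matrix dimensions are all made-up placeholders.

```r
# Mock object mimicking the documented top-level fields.
# It is NOT produced by metagene; every value below is a placeholder.
groups <- list(
  design = NULL,  # no design was supplied in this mock
  param = list(specie = "mouse", maxDistance = 5000, cores = 1),
  bamFilesDescription = data.frame(
    bam = "align1_rep1_sorted.bam",   # hypothetical sorted file name
    oldBam = "align1_rep1.bam",
    alignedCount = 1000000,           # hypothetical read count
    stringsAsFactors = FALSE
  ),
  matrix = list(
    list1 = matrix(0, nrow = 2, ncol = 10)  # hypothetical name and dimensions
  )
)

# Inspect the first level, then drill into individual fields.
names(groups)
groups$param$maxDistance
groups$bamFilesDescription$alignedCount
dim(groups$matrix[[1]])
```

With a real result, the same `$` and `[[` accessors would apply, but the contents of each field depend on the inputs.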

Details

This function extracts the read density from alignment files (BAM) in the vicinity of the transcription start sites of one or multiple lists of genes.

The values are normalized as reads per million aligned reads (RPM).
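
As a rough illustration of RPM scaling, the sketch below divides raw per-position counts by the number of aligned reads and multiplies by one million. The counts and the read total are invented numbers, and the package's internal computation may differ in detail.

```r
# Hypothetical raw read counts at five positions around a feature.
raw_counts <- c(12, 30, 45, 30, 12)

# Hypothetical total number of aligned reads in the BAM file.
aligned_count <- 15e6

# Express each value per million aligned reads.
rpm <- raw_counts * 1e6 / aligned_count
print(rpm)
```

Scaling by the aligned read count makes profiles from BAM files of different sequencing depths comparable.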

It is possible to parse multiple groups of genes by saving each list in a separate file and passing a vector of the file names as the features parameter.
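
A minimal sketch of preparing two gene lists as separate files follows. The gene IDs are illustrative, and the one-ID-per-line layout without a header is an assumption about the expected file format rather than something stated on this page. The parseFeatures call is left commented out because it needs the metagene package and an existing BAM file.

```r
# Illustrative Ensembl gene IDs for two groups of genes.
group1 <- c("ENSMUSG00000000001", "ENSMUSG00000000028")
group2 <- c("ENSMUSG00000000031", "ENSMUSG00000000037")

# Save each list in its own file, one ID per line (assumed layout).
file1 <- tempfile(pattern = "list1_", fileext = ".txt")
file2 <- tempfile(pattern = "list2_", fileext = ".txt")
writeLines(group1, file1)
writeLines(group2, file2)

# This vector of file names is what would be passed as `features`.
featuresFiles <- c(file1, file2)

# Not run: requires metagene and a BAM file stored in bamFileName.
# groups <- parseFeatures(bamFileName, featuresFiles, specie = "mouse")
```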

With the design parameter, parseFeatures can handle more complex experimental designs, such as the use of replicates and/or controls. The values of the controls are subtracted from every replicate.
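
The sketch below illustrates the idea of a design table and of subtracting control values from each replicate. It does not call metagene. The file names and profile values are invented, and two details are assumptions: that the first column of the design holds the file names, and that multiple controls would be averaged before subtraction.

```r
# Hypothetical design: one row per sample, one column per group.
# Assumed layout: first column = file names; 1 = treatment, 2 = control.
design <- data.frame(
  sample = c("treat_rep1.bam", "treat_rep2.bam", "control.bam"),
  groupA = c(1, 1, 2),
  stringsAsFactors = FALSE
)

# Hypothetical RPM profiles: one row per sample, one column per position.
rpm <- rbind(
  c(2, 4, 6),
  c(3, 5, 7),
  c(1, 1, 2)
)
rownames(rpm) <- design$sample

# Split the rows into treatment replicates and control(s).
treatment <- rpm[design$groupA == 1, , drop = FALSE]
control <- colMeans(rpm[design$groupA == 2, , drop = FALSE])

# Subtract the control profile from every replicate, position by position.
corrected <- sweep(treatment, 2, control, FUN = "-")
print(corrected)
```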

Examples

  bamFileName <- system.file("extdata/align1_rep1.bam", package="metagene")
  featuresFileName <- system.file("extdata/list1.txt", package="metagene")
  groups <- parseFeatures(bamFileName, featuresFileName, specie="mouse")