Pasha (version 0.99.18)

multiread_CSEMDispatch: Multiread scoring CSEM dispatch

Description

This script dispatches the scoring of multi-read aligned reads according to the CSEM algorithm developed by Chung et al. (see "Discovering Transcription Factor Binding Sites in Highly Repetitive Regions of Genomes with Multi-Read Analysis of ChIP-Seq Data" (2011) PLoS Computational Biology).

Usage

multiread_CSEMDispatch(alignedFile,
                       outputFolder,
                       referenceFile,
                       window_size=101,
                       iteration_number=200,
                       incrArtefactThrEvery=NA,
                       verbosity=0)

Arguments

alignedFile
An atomic character string. The full path to the file containing the reads aligned by bowtie with the --concise option.
outputFolder
An atomic character string. The path to the folder where the file output by the script must be stored.
referenceFile
An atomic character string. Either a full path to a reference file (see details for format specification), or the ID of one reference included in the package (see details for available ones).
window_size
A positive integer. The size of the window used by the algorithm (see algorithm details). Default value is 101.
iteration_number
A positive integer. The number of iterations executed by the algorithm (see algorithm details). Default value is 200.
incrArtefactThrEvery
A complex parameter (see details). A numeric value or NA. A strictly positive numeric value activates the option that removes 'artifacts', defining the threshold used to consider piles as 'artifacts' as 'number of reads in the experiment de
verbosity
An integer. The verbosity level: 0 = no messages, 1 = trace level.
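
The argument types listed above can be checked before launching a long run. The following is a minimal sketch in base R; `check_csem_args` is a hypothetical helper that is not part of Pasha, and the rules it applies are only a reading of the descriptions above (for instance, it assumes `verbosity` is a whole number and does not try to resolve `referenceFile`, since that may be either a path or a packaged reference ID).

```r
# Hypothetical helper (NOT part of Pasha): returns a character vector of
# problems found in the arguments; an empty vector means no problem was found.
check_csem_args <- function(alignedFile, outputFolder, referenceFile,
                            window_size = 101, iteration_number = 200,
                            incrArtefactThrEvery = NA, verbosity = 0) {
  is_string <- function(x) is.character(x) && length(x) == 1L && !is.na(x)
  is_whole  <- function(x) {
    is.numeric(x) && length(x) == 1L && !is.na(x) && x == round(x)
  }

  problems <- character(0)

  if (!is_string(alignedFile) || !file.exists(alignedFile)) {
    problems <- c(problems, "alignedFile must be the path to an existing file")
  }
  if (!is_string(outputFolder) || !dir.exists(outputFolder)) {
    problems <- c(problems, "outputFolder must be the path to an existing folder")
  }
  if (!is_string(referenceFile)) {
    problems <- c(problems, "referenceFile must be a single character string")
  }
  if (!is_whole(window_size) || window_size <= 0) {
    problems <- c(problems, "window_size must be a positive integer")
  }
  if (!is_whole(iteration_number) || iteration_number <= 0) {
    problems <- c(problems, "iteration_number must be a positive integer")
  }
  # NA is accepted; otherwise a single numeric value is expected
  if (length(incrArtefactThrEvery) != 1L ||
      !(is.na(incrArtefactThrEvery) || is.numeric(incrArtefactThrEvery))) {
    problems <- c(problems, "incrArtefactThrEvery must be a numeric value or NA")
  }
  if (!is_whole(verbosity)) {
    problems <- c(problems, "verbosity must be an integer")
  }

  problems
}

# Usage with placeholder inputs
demo_file <- tempfile()
file.create(demo_file)
ok_result  <- check_csem_args(demo_file, tempdir(), "mm9")
bad_result <- check_csem_args(demo_file, tempdir(), "mm9", window_size = -1)
print(bad_result)
```

Returning the list of problems, rather than stopping at the first one, lets the caller report every issue at once.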

Value

  • A tab-separated text file formatted as follows:
    • Column 1 : Chromosome name
    • Column 2 : Strand
    • Column 3 : Position
    • Column 4 : Score
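
A file with this layout can be loaded with base R. The sketch below writes a small stand-in file with made-up values and reads it back; the absence of a header line, the "+"/"-" strand encoding, and the column names are assumptions made for illustration, since the page does not specify them.

```r
# Stand-in for a file produced by the script; all values are invented.
example_file <- tempfile(fileext = ".txt")
writeLines(c("chr1\t+\t3000150\t0.75",
             "chr1\t-\t3000420\t0.25",
             "chr2\t+\t5120033\t1"),
           example_file)

# Read the four tab-separated columns (assumes no header line)
scores <- read.table(example_file,
                     sep = "\t",
                     header = FALSE,
                     col.names = c("chromosome", "strand", "position", "score"),
                     colClasses = c("character", "character",
                                    "integer", "numeric"))

# Total score per chromosome
score_by_chr <- tapply(scores$score, scores$chromosome, sum)
print(score_by_chr)
```

Setting `colClasses` explicitly keeps the strand column as text rather than letting `read.table` guess its type.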

Details

The script considers the reads that have been aligned at several locations by bowtie (multi-reads). To each read, it assigns a score determined by the CSEM algorithm (Chung et al. "Discovering Transcription Factor Binding Sites in Highly Repetitive Regions of Genomes with Multi-Read Analysis of ChIP-Seq Data" (2011) PLoS Computational Biology). The script outputs a tab-separated text file formatted as follows:
  • Column 1 : Chromosome name
  • Column 2 : Strand
  • Column 3 : Position
  • Column 4 : Score
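
To give an intuition for how a window size and an iteration count can drive the scoring of multi-reads, here is a toy illustration of iterative fractional allocation. It is neither the package's implementation nor a faithful reproduction of the published CSEM model: the data are invented, strands are ignored, and the update rule (weights proportional to the weighted read count in a window centred on each candidate position) is a simplifying assumption.

```r
# Toy data: positions of uniquely aligned reads on a single chromosome
unique_pos <- c(100, 105, 110, 120, 500)

# Each multi-read has several candidate positions (hypothetical values)
multi_reads <- list(
  readA = c(108, 505),
  readB = c(115, 900)
)

window_size      <- 101  # assumed here to be a window centred on the position
iteration_number <- 20
half <- (window_size - 1) / 2

# Start by splitting each multi-read evenly between its candidates
weights <- lapply(multi_reads, function(p) rep(1 / length(p), length(p)))

for (i in seq_len(iteration_number)) {
  # Unique reads count for 1, multi-read candidates for their current weight
  all_pos <- c(unique_pos, unlist(multi_reads, use.names = FALSE))
  all_w   <- c(rep(1, length(unique_pos)), unlist(weights, use.names = FALSE))

  weights <- lapply(multi_reads, function(p) {
    # Weighted read count within the window around each candidate position
    local <- vapply(p,
                    function(x) sum(all_w[abs(all_pos - x) <= half]),
                    numeric(1))
    # Normalise so the scores of one multi-read sum to 1
    local / sum(local)
  })
}

print(weights)
```

In this toy setting, candidates that fall in read-dense regions accumulate most of the weight, while isolated candidates are progressively down-weighted.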

See Also

processPipeline, multiread_RemoveArtifact, multiread_UniformDispatch

Examples

# Define input aligned file
my_aligned_file <- system.file("extdata",
                               "embededDataTest_MultiSignal.bow",
                               package="Pasha")

# Define the output folder
my_output_folder <- tempdir()

# Define the genome reference file
genome_reference_file <- system.file("resources",
                                     "mm9.ref",
                                     package="Pasha")

# Launch the script
multiread_CSEMDispatch(my_aligned_file, 
                       my_output_folder, 
                       genome_reference_file,
                       incrArtefactThrEvery=7000000, 
                       verbosity=1)
