Basic4Cseq (version 1.8.0)

prepare4CseqData: Alignment and filtering of raw 4C-seq data

Description

This function is an optional wrapper for the alignment and preliminary filtering of 4C-seq data. prepare4CseqData reads a provided 4C-seq fastq file from disk. The reads are aligned with BWA, and the function checkRestrictionEnzymeSequence can optionally be used to filter them. Samtools and bedtools are then used to intersect the filtered reads with a given 4C-seq fragment library, so that the result can be visualized (e.g. with the Integrative Genomics Viewer, IGV).
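To give a rough idea of what filtering on the restriction enzyme sequence means, the sketch below keeps only reads that begin with the first cutter sequence. It is a conceptual illustration in base R, not the implementation of checkRestrictionEnzymeSequence; the helper name keep_reads_with_cutter and the toy reads are hypothetical, and the assumption that a valid read starts with the cutter is made only for this example.

```r
# Hypothetical helper: keep reads whose sequence starts with the cutter.
# Assumption for illustration only; the package's own check may differ.
keep_reads_with_cutter <- function(reads, cutter) {
  reads[startsWith(toupper(reads), toupper(cutter))]
}

# Toy reads, invented for this example
reads <- c("AAGCTTGGCA", "TTTTAAGCTT", "AAGCTTACGT")

# Only the first and third read begin with "AAGCTT"
kept <- keep_reads_with_cutter(reads, "AAGCTT")
print(kept)
```

In the package itself this kind of filtering is switched on with controlCutterSequence = TRUE.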

Usage

prepare4CseqData(fastqFileName, firstCutter, fragmentLibrary, referenceGenome,
                 pathToBWA = "", pathToSam = "", pathToBED = "",
                 controlCutterSequence = FALSE, bwaThreads = 1,
                 minFragEndLength = 0)

Arguments

fastqFileName
The name of the fastq file that contains the 4C-seq reads
firstCutter
First cutting enzyme sequence for the 4C-seq experiment, e.g. "AAGCTT"
fragmentLibrary
Name of the fragment library to use for the current 4C-seq experiment; has to correspond to the chosen cutters and chosen genome
referenceGenome
Name (plus path) of the reference genome to use
pathToBWA
Path to BWA
pathToSam
Path to samtools
pathToBED
Path to bedtools
controlCutterSequence
If TRUE, the function checkRestrictionEnzymeSequence is used to filter non-valid 4C-seq reads
bwaThreads
Number of BWA threads
minFragEndLength
Minimum fragment end length to use for BED export
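Since the function depends on three external programs, it can help to check that they are reachable before calling it. The following is a minimal sketch, assuming the tools are installed under their usual executable names (bwa, samtools, bedtools) and found on the PATH, so the pathTo* arguments are left at their defaults. The file names are those of the package example; the values chosen for controlCutterSequence, bwaThreads and minFragEndLength are illustrative only, not recommendations.

```r
# Executable names are an assumption (the usual install names).
tools <- c("bwa", "samtools", "bedtools")
found <- Sys.which(tools)                  # "" for tools not on the PATH
missing_tools <- names(found)[!nzchar(found)]

if (length(missing_tools) > 0) {
  message("Not found on PATH: ", paste(missing_tools, collapse = ", "))
} else if (requireNamespace("Basic4Cseq", quietly = TRUE) &&
           file.exists("veryShortExample.fastq")) {
  # Example data files are assumed to be in the working directory.
  Basic4Cseq::prepare4CseqData(
    "veryShortExample.fastq", "CATG", "veryShortLib.csv",
    referenceGenome = "veryShortReference.fasta",
    controlCutterSequence = TRUE,  # illustrative: enable read filtering
    bwaThreads = 2,                # illustrative thread count
    minFragEndLength = 20          # illustrative threshold
  )
}
```

If any tool is missing, the sketch only reports it and does not attempt the alignment.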

References

Li, H. and Durbin, R. (2009) Fast and accurate short read alignment with Burrows-Wheeler Transform, Bioinformatics, 25, 1754-60.

Helga Thorvaldsdottir, James T. Robinson, Jill P. Mesirov. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Briefings in Bioinformatics 2012.

See Also

checkRestrictionEnzymeSequence

Examples

if (interactive()) {
  # BWA, samtools and bedtools must be installed.
  # The example data files (from the package) are assumed to be
  # in the working directory.
  prepare4CseqData("veryShortExample.fastq", "CATG", "veryShortLib.csv",
                   referenceGenome = "veryShortReference.fasta")
}