
mosaics (version 2.10.0)

generateWig: Construct wiggle files from an aligned ChIP-seq read file

Description

Construct wiggle files from an aligned ChIP-seq read file.

Usage

generateWig( infile=NULL, fileFormat=NULL, outfileLoc="./", byChr=FALSE, useChrfile=FALSE, chrfile=NULL, excludeChr=NULL, PET=FALSE, fragLen=200, span=200, capping=0, normConst=1, perl="perl" )

Arguments

infile
Name of the aligned read file to be processed.
fileFormat
Format of the aligned read file to be processed. Currently, generateWig permits the following aligned read file formats for SET data (PET = FALSE): "eland_result" (Eland result), "eland_extended" (Eland extended), "eland_export" (Eland export), "bowtie" (default Bowtie), "sam" (SAM), "bam" (BAM), "bed" (BED), and "csem" (CSEM). For PET data (PET = TRUE), the following aligned read file formats are allowed: "eland_result" (Eland result), "sam" (SAM), and "bam" (BAM).
outfileLoc
Directory of processed wiggle files. By default, processed wiggle files are exported to the current directory.
byChr
Construct a separate wiggle file for each chromosome? Possible values are TRUE or FALSE. If byChr=FALSE, all chromosomes are exported to one file. If byChr=TRUE, each chromosome is exported to a separate file. Default is FALSE.
useChrfile
Is a chromosome information file provided? Possible values are TRUE or FALSE. If useChrfile=TRUE, the file named in chrfile is used; if useChrfile=FALSE, no such file is assumed. Default is FALSE.
chrfile
Name of the file for chromosome info. In this file, the first and second columns are ID and size of each chromosome, respectively.
excludeChr
Vector of chromosomes that will be excluded from the analysis. This argument is ignored if useChrfile=TRUE.
PET
Is the file paired-end tag (PET) data? If PET=FALSE, it is assumed that the file is SET data. If PET=TRUE, it is assumed that the file is PET data. Default is FALSE (SET data).
fragLen
Average fragment length. Default is 200. This argument is ignored if PET=TRUE.
span
Span used in wiggle files. Default is 200.
capping
Maximum number of reads allowed to start at each nucleotide position. To avoid potential PCR amplification artifacts, the number of reads that can start at a nucleotide position is capped at this value. Capping is not applied if a non-positive value is used. Default is 0 (no capping).
normConst
Normalizing constant used to scale the values at each position. Default is 1.
perl
Name of the perl executable to be called. Default is "perl".
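
The effect of the capping argument can be illustrated with a minimal sketch. This is not the package's implementation (generateWig delegates the processing to a Perl script, and may treat strands separately); capReads is a hypothetical helper and the read start positions are made-up data.

```r
# Hypothetical helper: cap the number of reads starting at each position.
# A non-positive `capping` leaves the counts unchanged, mirroring the
# documented behaviour of the `capping` argument.
capReads <- function(starts, capping = 0) {
  tab <- table(starts)                           # reads starting at each position
  counts <- setNames(as.vector(tab), names(tab)) # plain named vector of counts
  if (capping > 0) {
    counts <- pmin(counts, capping)              # cap the per-position counts
  }
  counts
}

starts <- c(100, 100, 100, 100, 250, 250, 400)   # made-up read start positions
capReads(starts, capping = 0)   # no capping: counts stay at 4, 2, 1
capReads(starts, capping = 3)   # position 100 is capped at 3
```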

Value

Processed wig files are exported to the directory specified in outfileLoc.

Details

Wiggle files are constructed from the aligned read file and exported to the directory specified in the outfileLoc argument. If byChr=FALSE, the wiggle file is named [infileName]_fragL[fragLen]_span[span].wig for SET data (PET = FALSE) and [infileName]_span[span].wig for PET data (PET = TRUE). If byChr=TRUE, the wiggle files are named [infileName]_fragL[fragLen]_span[span]_[chrID].wig for SET data (PET = FALSE) and [infileName]_span[span]_[chrID].wig for PET data (PET = TRUE), where chrID is the ID of a chromosome that reads align to. These chromosome IDs are extracted from the aligned read file.
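
The naming scheme above can be sketched as a small helper, which may be handy for locating the output afterwards. expectedWigName is a hypothetical function, not part of mosaics, and it assumes that [infileName] is the base name of infile with its extension kept.

```r
# Hypothetical helper: build the wiggle file name(s) described above.
# Assumption: [infileName] is the base name of `infile`, extension included.
expectedWigName <- function(infile, PET = FALSE, fragLen = 200, span = 200,
                            chrID = NULL) {
  stem <- basename(infile)
  if (!PET) {
    stem <- paste0(stem, "_fragL", fragLen)  # fragLen appears only for SET data
  }
  stem <- paste0(stem, "_span", span)
  if (is.null(chrID)) {
    paste0(stem, ".wig")                     # byChr=FALSE: a single file
  } else {
    paste0(stem, "_", chrID, ".wig")         # byChr=TRUE: one file per chromosome
  }
}

expectedWigName("reads.bam")
# "reads.bam_fragL200_span200.wig"
expectedWigName("reads.bam", PET = TRUE, chrID = c("chr21", "chr22"))
# "reads.bam_span200_chr21.wig" "reads.bam_span200_chr22.wig"
```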

If the file for chromosome information is provided (useChrfile=TRUE and chrfile is not NULL), only the chromosomes specified in the file are considered. Otherwise, chromosomes listed in excludeChr are left out of the processed wiggle files. The excludeChr argument is ignored if useChrfile=TRUE.
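
A chromosome information file with the two columns described for chrfile (ID, then size) could be written as follows. This is a sketch under assumptions: the tab separator and the absence of a header line are guesses, not something this page states, and the chromosome sizes are illustrative values to be replaced with those of your genome build.

```r
# Illustrative chromosome IDs and sizes; substitute your own genome build.
chrInfo <- data.frame(
  chrID = c("chr21", "chr22"),
  size  = c(48129895L, 51304566L)
)

# Assumption: tab-separated, no header, no quoting.
chrfile <- file.path(tempdir(), "chrInfo.txt")
write.table(chrInfo, file = chrfile, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
readLines(chrfile)

# The file could then be passed to generateWig (not run here;
# "reads.bam" is a placeholder input file):
# generateWig( infile="reads.bam", fileFormat="bam",
#     useChrfile=TRUE, chrfile=chrfile )
```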

generateWig currently supports the following aligned read file formats for SET data (PET = FALSE): Eland result ("eland_result"), Eland extended ("eland_extended"), Eland export ("eland_export"), default Bowtie ("bowtie"), SAM ("sam"), BAM ("bam"), BED ("bed"), and CSEM ("csem"). For PET data (PET = TRUE), the following aligned read file formats are allowed: Eland result ("eland_result"), SAM ("sam"), and BAM ("bam").
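
Because the allowed formats depend on PET, it can be convenient to check a fileFormat value before starting a long run. The helpers below are hypothetical and not part of mosaics; they simply encode the two lists given above.

```r
# Hypothetical helpers: encode the format lists from the paragraph above.
supportedFormats <- function(PET = FALSE) {
  if (PET) {
    c("eland_result", "sam", "bam")
  } else {
    c("eland_result", "eland_extended", "eland_export",
      "bowtie", "sam", "bam", "bed", "csem")
  }
}

isSupportedFormat <- function(fileFormat, PET = FALSE) {
  fileFormat %in% supportedFormats(PET)
}

isSupportedFormat("bed", PET = FALSE)  # TRUE
isSupportedFormat("bed", PET = TRUE)   # FALSE: BED is not listed for PET data
```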

If the input file format is neither BED nor CSEM BED, generateWig retains only reads that map uniquely to the reference genome.

References

Kuan, PF, D Chung, JA Thomson, R Stewart, and S Keles (2011), "A Statistical Framework for the Analysis of ChIP-Seq Data", Journal of the American Statistical Association, Vol. 106, pp. 891-903.

Chung, D, Q Zhang, and S Keles (2014), "MOSAiCS-HMM: A model-based approach for detecting regions of histone modifications from ChIP-seq data", Datta S and Nettleton D (eds.), Statistical Analysis of Next Generation Sequencing Data, Springer.

Examples

## Not run: 
library(mosaics)
library(mosaicsExample)

# Build a single wiggle file from the example BAM file shipped with mosaicsExample
generateWig( infile=system.file( file.path("extdata","wgEncodeSydhTfbsGm12878Stat1StdAlnRep1_chr22_sorted.bam"), package="mosaicsExample"), 
    fileFormat="bam", outfileLoc="~/", 
    PET=FALSE, fragLen=200, span=200, capping=0, normConst=1 )
## End(Not run)
