Usage
getUniqueCleavageEvents(alignment.inputfile, umi.inputfile,
alignment.format = c("auto", "bam", "bed"), umi.header = FALSE, read.ID.col = 1,
umi.col = 2, umi.sep = "\t", keep.R1only = TRUE, keep.R2only = TRUE,
concordant.strand = TRUE, max.paired.distance = 1000,
min.mapping.quality = 30, max.R1.len = 130, max.R2.len = 130,
apply.both.max.len = FALSE, same.chromosome = TRUE,
distance.inter.chrom = -1, min.R1.mapped = 20, min.R2.mapped = 20,
apply.both.min.mapped = FALSE, max.duplicate.distance = 0,
umi.plus.R1start.unique = TRUE, umi.plus.R2start.unique = TRUE,
n.cores.max = 6)
Arguments
alignment.inputfile
The alignment file. BED output with CIGAR information is supported; see
alignment.format for the accepted formats. We suggest running the workflow
binReads.sh, which sequentially runs barcode binning, adaptor removal,
alignment to the genome, alignment quality filtering, and BED file conversion.
The workflow script and its helper scripts can be downloaded from
http://mccb.umassmed.edu/GUIDE-seq/binReads/
umi.inputfile
A text file containing at least two columns: one holds the read identifier and
the other holds the UMI, or the UMI plus the first few bases of the R1 read.
We suggest using getUMI.sh to generate this file. The script and its helper
scripts can be downloaded from http://mccb.umassmed.edu/GUIDE-seq/getUMI/
alignment.format
The format of the alignment input file. Currently only the
BAM and BED file formats are supported. The BED format will be deprecated soon.
umi.header
Indicates whether the UMI input file contains a header line.
Default FALSE
read.ID.col
The index of the column containing the read identifier in the UMI input file,
default 1
umi.col
The index of the column containing the UMI, or the UMI plus the first few bases
of sequence from the R1 reads, default 2
umi.sep
The column separator in the UMI input file, default tab
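As an illustration of how umi.header, read.ID.col, umi.col and umi.sep describe
the layout of the UMI file, the following base R sketch writes a small
tab-separated file and reads it back. The file contents, the read identifiers
and the variable names are made up for illustration; this is not the
function's internal parsing code.

```r
# Hypothetical UMI file: column 1 = read identifier, column 2 = UMI
umi.file <- tempfile(fileext = ".txt")
writeLines(c("read1\tACGTTGCA", "read2\tTTGACCGA", "read3\tACGTTGCA"),
           umi.file)

# Settings mirroring the documented defaults
umi.header  <- FALSE
read.ID.col <- 1
umi.col     <- 2
umi.sep     <- "\t"

# Read every column as character so that UMIs are never coerced
umi.table <- read.table(umi.file, header = umi.header, sep = umi.sep,
                        colClasses = "character")

# Pick out the two columns of interest by index
umi <- data.frame(readName = umi.table[[read.ID.col]],
                  UMI = umi.table[[umi.col]],
                  stringsAsFactors = FALSE)
print(umi)
```

If the file had a header line or a different column order, only the four
settings at the top would need to change.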
keep.R1only
Specifies whether to include alignments that have only R1 and no paired R2.
Default TRUE
keep.R2only
Specifies whether to include alignments that have only R2 and no paired R1.
Default TRUE
concordant.strand
Specifies whether R1 and R2 should align to the same strand or to opposite
strands. Default TRUE (opposite strands)
max.paired.distance
Specifies the maximum distance allowed between paired R1 and R2 reads.
Default 1000 bp
min.mapping.quality
Specifies the minimum mapping quality of acceptable alignments. Default 30
max.R1.len
The maximum retained R1 length to be considered for downstream analysis,
default 130 bp. Note that the default of 130 works well when the read
length is 150 bp, and that the retained R1 length is not necessarily
equal to the mapped R1 length
max.R2.len
The maximum retained R2 length to be considered for downstream analysis,
default 130 bp. Note that the default of 130 works well when the read
length is 150 bp, and that the retained R2 length is not necessarily
equal to the mapped R2 length
apply.both.max.len
Specifies whether to apply the maximum length requirement to both R1 and R2
reads, default FALSE
same.chromosome
Specifies whether paired reads are required to align to the same chromosome,
default TRUE
distance.inter.chrom
Specifies the distance value to assign to paired reads that align to
different chromosomes, default -1
min.R1.mapped
The minimum mapped R1 length to be considered for downstream analysis,
default 20 bp.
min.R2.mapped
The minimum mapped R2 length to be considered for downstream analysis,
default 20 bp.
apply.both.min.mapped
Specifies whether to apply the minimum mapped length requirement to both R1
and R2 reads, default FALSE
max.duplicate.distance
Specifies the maximum distance between two reads for them to be considered
duplicates, default 0. Currently only 0 is supported
umi.plus.R1start.unique
Specifies whether two mapped reads are considered as unique if both
contain the same UMI and the same alignment start for the R1 read, default TRUE.
umi.plus.R2start.unique
Specifies whether two mapped reads are considered as unique if both
contain the same UMI and the same alignment start for the R2 read, default TRUE.
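The idea of combining a UMI with alignment start positions to identify
duplicate reads can be sketched in base R as follows. This is only an
illustration under stated assumptions: the data frame, its column names and
the rule of keeping the first read per combination are hypothetical, and the
way the function itself combines umi.plus.R1start.unique and
umi.plus.R2start.unique may differ.

```r
# Hypothetical alignments: UMI plus R1 and R2 alignment start positions
reads <- data.frame(
  readName = c("read1", "read2", "read3", "read4"),
  UMI      = c("ACGTTGCA", "ACGTTGCA", "ACGTTGCA", "TTGACCGA"),
  R1.start = c(1000L, 1000L, 1005L, 1000L),
  R2.start = c(1200L, 1200L, 1200L, 1200L),
  stringsAsFactors = FALSE
)

# Assumption: reads sharing the UMI and both start positions are duplicates.
# duplicated() flags every occurrence after the first of each combination.
is.dup <- duplicated(reads[, c("UMI", "R1.start", "R2.start")])

# Keep the first read of each combination
unique.reads <- reads[!is.dup, ]
print(unique.reads$readName)
```

Here read2 is dropped because it shares its UMI and both start positions with
read1, while read3 (different R1 start) and read4 (different UMI) are kept.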
n.cores.max
The maximum number of cores to use in multicore mode,
i.e., parallel processing, default 6. Set it to 1 to disable multicore
processing for small datasets.
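A call that overrides a few of the defaults might look like the sketch below.
It assumes the function is provided by the Bioconductor package GUIDEseq; the
two file paths are hypothetical placeholders, so the call is only attempted
when the package is installed and both files exist.

```r
# Hypothetical input paths; replace with your own files
alignment.file <- "path/to/alignments.sorted.bam"
umi.file       <- "path/to/umi.txt"

# Arguments that differ from, or restate, the documented defaults
args <- list(
  alignment.inputfile = alignment.file,
  umi.inputfile       = umi.file,
  alignment.format    = "bam",
  min.mapping.quality = 30,
  n.cores.max         = 1   # disable multicore processing for a small dataset
)

# Only attempt the call when the package and both input files are available
if (requireNamespace("GUIDEseq", quietly = TRUE) &&
    all(file.exists(alignment.file, umi.file))) {
  cleavages <- do.call(GUIDEseq::getUniqueCleavageEvents, args)
}
```

All arguments not listed in args keep the defaults shown in the Usage section.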