assignSeq2REsite: Assign mapped sequence tags to corresponding restriction enzyme (RE) cut sites

Description

Given sequence tags aligned to a genome as a RangedData object, and a map built with the buildREmap function, assignSeq2REsite first identifies RE sites that have mapped sequence tags around the cut position, taking into account the user-defined offset, the sequence length and the strand of the aligned sequences. These RE sites are then used as seeds for assigning the remaining tags. How tags associated with multiple RE sites are partitioned depends on which of five strategies the user selects: unique, average, estimate, best or random.

Usage

assignSeq2REsite(input.RD, REmap.RD, cut.offset = 1, seq.length = 36,
    allowed.offset = 5, min.FragmentLength = 60, max.FragmentLength = 300,
    partitionMultipleRE = c("unique", "average", "estimate", "best", "random"))

Arguments

input.RD

A RangedData object holding the mapped sequences; see the example below.

REmap.RD

A RangedData object holding the restriction enzyme (RE) cut site map; see the example below.

cut.offset

The cut offset from the start of the RE recognition sequence. The index is 0-based, i.e., 1 means the RE cuts at position 2.

seq.length

Sequence length: 36 means that the sequence tags are 36 bases long.

allowed.offset

Offset allowed to account for imperfect sticky end repair and primer addition.

min.FragmentLength

Minimum fragment length of the sequences size-selected for sequencing

max.FragmentLength

Maximum fragment length of the sequences size-selected for sequencing

partitionMultipleRE

The strategy for partitioning sequences associated with multiple RE sites. For strategy unique, only sequence tags that are associated with a unique RE site within the distance between min.FragmentLength and max.FragmentLength are kept for downstream analysis. For strategy average, sequence tags are partitioned equally among the associated RE sites. For strategy estimate, sequence tags are partitioned among the associated RE sites with a weight function, which is determined from the count distribution of the RE seed sites (see Description). For strategy best, sequence tags are assigned to the most probable RE site, using the same weight function as strategy estimate. For strategy random, sequence tags are randomly assigned to one of the associated RE sites.
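
The five strategies can be pictured as different ways of splitting one tag across its candidate RE sites. The following is a minimal sketch in base R, not the package's internal code. The site names and the vector seed.counts are hypothetical, the weight is assumed to be simply proportional to the seed count (the actual weight function in REDseq may differ), and the random pick is assumed to be uniform.

```r
# Hypothetical example: one tag is associated with three candidate RE sites.
# seed.counts stands in for the tag counts seen at those sites in the seeding step.
seed.counts <- c(siteA = 30, siteB = 10, siteC = 60)
n.sites <- length(seed.counts)
sites <- names(seed.counts)

# unique: the tag is kept only if exactly one site is associated with it
w.unique <- setNames(if (n.sites == 1) 1 else rep(0, n.sites), sites)

# average: equal share for every associated site
w.average <- setNames(rep(1 / n.sites, n.sites), sites)

# estimate: share proportional to an assumed weight (here, the seed counts)
w.estimate <- seed.counts / sum(seed.counts)

# best: the whole tag goes to the site with the largest weight
w.best <- setNames(as.numeric(seq_len(n.sites) == which.max(w.estimate)), sites)

# random: the whole tag goes to one site picked at random (assumed uniform)
set.seed(1)
w.random <- setNames(as.numeric(sites == sample(sites, 1)), sites)

# One row per strategy, one column per site
round(rbind(unique = w.unique, average = w.average, estimate = w.estimate,
            best = w.best, random = w.random), 2)
```

Each row of the resulting matrix gives the fraction of the tag credited to each site. Under unique the row is all zeros here because the tag has more than one candidate site and is therefore dropped.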

Value

passed.filter: Sequences assigned to one or more RE sites; see r.unique$passed.filter in the examples.
notpassed.filter: Sequences not assigned to any RE site; see r.unique$notpassed.filter in the examples.
mREwithDetail: Detailed assignment information for sequences associated with multiple RE sites. Only available when partitionMultipleRE is set to average or estimate; see r.estimate$mREwithDetail in the examples.

References

1. Roberts, R.J., Restriction endonucleases. CRC Crit Rev Biochem, 1976. 4(2): p. 123-64.
2. Kessler, C. and V. Manta, Specificity of restriction endonucleases and DNA modification methyltransferases a review (Edition 3). Gene, 1990. 92(1-2): p. 1-248.
3. Pingoud, A., J. Alves, and R. Geiger, Restriction enzymes. Methods Mol Biol, 1993. 16: p. 107-200.

Examples

	library(REDseq)
	# Load the example mapped tags and the example RE cut site map
	data(example.REDseq)
	data(example.map)
	# Run the assignment once per partitioning strategy
	r.unique = assignSeq2REsite(example.REDseq, example.map,
		cut.offset = 1, seq.length = 36, allowed.offset = 5,
		min.FragmentLength = 60, max.FragmentLength = 300,
		partitionMultipleRE = "unique")
	r.average = assignSeq2REsite(example.REDseq, example.map,
		cut.offset = 1, seq.length = 36, allowed.offset = 5,
		min.FragmentLength = 60, max.FragmentLength = 300,
		partitionMultipleRE = "average")
	r.random = assignSeq2REsite(example.REDseq, example.map,
		cut.offset = 1, seq.length = 36, allowed.offset = 5,
		min.FragmentLength = 60, max.FragmentLength = 300,
		partitionMultipleRE = "random")
	r.best = assignSeq2REsite(example.REDseq, example.map,
		cut.offset = 1, seq.length = 36, allowed.offset = 5,
		min.FragmentLength = 60, max.FragmentLength = 300,
		partitionMultipleRE = "best")
	r.estimate = assignSeq2REsite(example.REDseq, example.map,
		cut.offset = 1, seq.length = 36, allowed.offset = 5,
		min.FragmentLength = 60, max.FragmentLength = 300,
		partitionMultipleRE = "estimate")
	# Inspect the components of the returned list (see Value)
	r.unique$passed.filter
	r.unique$notpassed.filter
	r.estimate$passed.filter
	r.estimate$notpassed.filter
	r.estimate$mREwithDetail