OFDEG: Oligonucleotide Frequency Derived Error Gradient
Description
The Oligonucleotide Frequency Derived Error Gradient (OFDEG) approximates the rate at which the oligonucleotide frequencies of a sequence converge as the length of the sampled subsequence increases.
Usage
OFDEG(sequence, c, rc, d, m, t, k, norm=0)
Value
This function returns a data frame containing the error gradient of each nucleotide sequence.
Arguments
sequence
Nucleic acid sequence(s) read from a FASTA file. An RData object containing the contents of the FASTA file is also accepted.
c
Minimum sequence length cutoff, which corresponds to the length of the shortest sequence in the data set. Default is 160.
rc
Resampling depth at the cutoff, i.e. the number of subsequences of cutoff length. Default is 10.
d
Sampling depth, i.e. the number of equal-length subsequences randomly selected from the entire sequence. Default is 10. Longer sequences require greater sampling depths.
m
Word size, i.e. the initial subsequence length. Default is 100.
t
Step size, i.e. the change in subsequence length from one sampling instance to the next. Default is 6.
k
Size of the oligonucleotide (e.g. 4 for tetranucleotides, 6 for hexanucleotides). Default is 1.
norm
Normalization of the oligonucleotide frequency (OF) profile (0: no normalization, 1: normalize the OF profile). Default is norm = 0.
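To make the roles of k and norm concrete, the following is a minimal base R sketch of an oligonucleotide frequency profile. The helper name of_profile is hypothetical and is not part of the package. It assumes overlapping words over the alphabet A, C, G, T, and it assumes that normalization means dividing the counts by their total; the package may define either differently.

```r
# Hypothetical helper for illustration only; not the package's code.
# Counts all overlapping words of size k in a DNA string.
of_profile <- function(dna, k = 1, norm = 0) {
  dna <- toupper(dna)
  n <- nchar(dna)
  stopifnot(n >= k)

  # Extract every overlapping word of length k.
  starts <- seq_len(n - k + 1)
  words <- substring(dna, starts, starts + k - 1)

  # Enumerate all 4^k possible words in lexicographic order so that
  # every profile has the same layout, including zero counts.
  grid <- expand.grid(rep(list(c("A", "C", "G", "T")), k),
                      stringsAsFactors = FALSE)
  all_words <- unname(apply(grid[, k:1, drop = FALSE], 1, paste0,
                            collapse = ""))

  # Words containing other symbols (e.g. N) are silently dropped.
  counts <- table(factor(words, levels = all_words))
  profile <- setNames(as.numeric(counts), all_words)

  # Assumption: norm = 1 rescales counts to relative frequencies.
  if (norm == 1) profile <- profile / sum(profile)
  profile
}
```

For example, of_profile("AACG", k = 2, norm = 1) returns a vector of 16 dinucleotide frequencies in which AA, AC and CG are each 1/3 and all other entries are 0.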
Author
Dr. Anu Sharma,
Dr. Sanjeev Kumar
Details
The Oligonucleotide Frequency Derived Error Gradient (OFDEG) attempts to capture convergence behavior by subsampling the genomic fragment and measuring the decrease in error as the length of the subsamples increases up to the fragment length. Because OFDEG is derived from the oligonucleotide frequency profile of a DNA sequence, it shows that a meaningful phylogenetic signal can be obtained even from relatively short DNA sequences.
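The subsampling idea described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the names kmer_freq and ofdeg_sketch are hypothetical, the error is assumed to be the Euclidean distance between the subsample profile and the profile of the fragment truncated to the cutoff, the gradient is assumed to be the slope of a linear fit of mean error against subsequence length, and the rc resampling step is omitted. The input is assumed to contain only A, C, G and T.

```r
# Illustrative sketch only; NOT the package's implementation.

# Relative frequencies of all 4^k overlapping words in a DNA string.
kmer_freq <- function(dna, k) {
  starts <- seq_len(nchar(dna) - k + 1)
  words <- substring(dna, starts, starts + k - 1)
  grid <- expand.grid(rep(list(c("A", "C", "G", "T")), k),
                      stringsAsFactors = FALSE)
  all_words <- unname(apply(grid[, k:1, drop = FALSE], 1, paste0,
                            collapse = ""))
  counts <- as.numeric(table(factor(words, levels = all_words)))
  counts / sum(counts)
}

# cutoff, depth, start_len and step play the roles of c, d, m and t.
ofdeg_sketch <- function(dna, cutoff = 160, depth = 10,
                         start_len = 100, step = 6, k = 1) {
  dna <- toupper(dna)
  stopifnot(nchar(dna) >= cutoff, start_len <= cutoff)

  # Assumption: use the first `cutoff` bases as the fragment.
  fragment <- substr(dna, 1, cutoff)
  reference <- kmer_freq(fragment, k)

  # Subsequence lengths grow from start_len towards cutoff in steps.
  sub_lengths <- seq(start_len, cutoff, by = step)

  # Mean error of `depth` random subsequences at each length.
  mean_error <- vapply(sub_lengths, function(len) {
    starts <- sample.int(cutoff - len + 1, depth, replace = TRUE)
    errs <- vapply(starts, function(s) {
      sub <- substr(fragment, s, s + len - 1)
      sqrt(sum((kmer_freq(sub, k) - reference)^2))
    }, numeric(1))
    mean(errs)
  }, numeric(1))

  # Assumption: the gradient is the slope of error versus length.
  unname(coef(lm(mean_error ~ sub_lengths))[2])
}

# Usage on a random 500-base sequence.
set.seed(1)
dna <- paste(sample(c("A", "C", "G", "T"), 500, replace = TRUE),
             collapse = "")
gradient <- ofdeg_sketch(dna)
```

Because the error is zero when the subsequence spans the whole fragment, the fitted slope is typically negative; the value itself depends on the random subsamples.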
References
Saeed, I., Halgamuge, S.K. The oligonucleotide frequency derived error gradient and its application to the binning of metagenome fragments. BMC Genomics 10, S10 (2009).