alakazam (version 0.2.2)

getSeqDistance: Calculate distance between two sequences

Description

getSeqDistance calculates the distance between two DNA sequences.

Usage

getSeqDistance(seq1, seq2, dist_mat = getDNAMatrix(gap = -1))

Arguments

seq1
character string containing a DNA sequence.
seq2
character string containing a DNA sequence.
dist_mat
Character distance matrix. Defaults to a Hamming distance matrix returned by getDNAMatrix. If gap characters, c("-", "."), are assigned a value of -1 in dist_mat then conti

Value

Numerical distance between seq1 and seq2.

See Also

A nucleotide distance matrix may be built with getDNAMatrix. An amino acid distance matrix may be built with getAAMatrix.

Examples

# Ungapped examples
getSeqDistance("ATGGC", "ATGGG")
getSeqDistance("ATGGC", "ATG??")

# Gaps will be treated as Ns with a gap=0 distance matrix
getSeqDistance("ATGGC", "AT--C", dist_mat=getDNAMatrix(gap=0))

# Gaps will be treated as universally non-matching characters with gap=1
getSeqDistance("ATGGC", "AT--C", dist_mat=getDNAMatrix(gap=1))

# Gaps of any length will be treated as single mismatches with a gap=-1 distance matrix
getSeqDistance("ATGGC", "AT--C", dist_mat=getDNAMatrix(gap=-1))

# Gaps of equivalent run lengths are not counted as gaps
getSeqDistance("ATG-C", "ATG-C", dist_mat=getDNAMatrix(gap=-1))

# Overlapping runs of gap characters are counted as a single gap
getSeqDistance("ATG-C", "AT--C", dist_mat=getDNAMatrix(gap=-1))
getSeqDistance("A-GGC", "AT--C", dist_mat=getDNAMatrix(gap=-1))
getSeqDistance("AT--C", "AT--C", dist_mat=getDNAMatrix(gap=-1))

# Discontiguous runs of gap characters each count as separate gaps
getSeqDistance("-TGGC", "AT--C", dist_mat=getDNAMatrix(gap=-1))
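
The gap handling described in the example comments above can be approximated in base R without loading alakazam. The following is a minimal sketch, not the package's implementation: the function name sketchSeqDistance and the matrix toy_mat are hypothetical, and the sketch assumes that a value of -1 in the matrix flags a gap-versus-nucleotide position and that each contiguous run of flagged positions adds 1 to the distance.

```r
# Hypothetical helper, not part of alakazam: looks up each aligned character
# pair in a distance matrix and counts each contiguous run of -1 entries once.
sketchSeqDistance <- function(seq1, seq2, dist_mat) {
    stopifnot(nchar(seq1) == nchar(seq2))
    chars1 <- strsplit(seq1, "")[[1]]
    chars2 <- strsplit(seq2, "")[[1]]

    # Index the matrix with a two-column character matrix of (row, column) names
    d <- dist_mat[cbind(chars1, chars2)]

    # Assumption: -1 marks a gap opposite a non-gap character.
    # rle() collapses consecutive flagged positions into one run.
    runs <- rle(d == -1)
    indels <- sum(runs$values)

    # Ordinary mismatches plus one unit per gap run
    sum(d[d >= 0]) + indels
}

# Toy matrix (an assumption, much smaller than the matrix getDNAMatrix returns):
# 0 on the diagonal, 1 for mismatches, -1 for gap versus nucleotide
nt <- c("A", "C", "G", "T")
chars <- c(nt, "-")
toy_mat <- 1 - diag(length(chars))
dimnames(toy_mat) <- list(chars, chars)
toy_mat["-", nt] <- -1
toy_mat[nt, "-"] <- -1

sketchSeqDistance("ATGGC", "ATGGG", toy_mat)  # 1: one mismatch
sketchSeqDistance("ATGGC", "AT--C", toy_mat)  # 1: one gap run
sketchSeqDistance("ATG-C", "ATG-C", toy_mat)  # 0: gaps present in both sequences
sketchSeqDistance("-TGGC", "AT--C", toy_mat)  # 2: two separate gap runs
```

Encoding the gap flag as -1 inside the matrix keeps the per-position lookup uniform, so the run counting can be done in a single pass over the looked-up values.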
