DescTools (version 0.99.13)

StrDist: Compute Distances Between Strings

Description

StrDist computes the distance between two strings using the Levenshtein or the Hamming method.

Usage

StrDist(x, y, method = "levenshtein", mismatch = 1, gap = 1)

Arguments

x
character vector, first string.
y
character vector, second string.
method
character, name of the distance method. This must be "levenshtein" or "hamming". Default is the classical Levenshtein distance.
mismatch
numeric, distance value for a mismatch between symbols.
gap
numeric, distance value for inserting a gap.

Value

  • StrDist returns an object of class "dist"; cf. dist.

Details

The function computes the Hamming or the Levenshtein (edit) distance between two given strings (sequences). The Hamming distance between two vectors is the number of mismatches between corresponding entries, so the two strings must have the same length. For the Levenshtein (edit) distance, a scoring matrix and a trace-back matrix are computed and stored as the attributes "ScoringMatrix" and "TraceBackMatrix" of the result. The numbers in the trace-back matrix encode the step taken at each cell: insertion of a gap in string y (1), match/mismatch (2), and insertion of a gap in string x (3).
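The two distances described above can be sketched in base R as follows. This is a minimal illustration, not the DescTools implementation: the function names hamming_sketch and levenshtein_sketch are hypothetical, the matrix orientation (rows for x, columns for y) and the tie-breaking rule are assumptions, and only the trace-back codes 1, 2, 3 are taken from the description.

```r
# Hypothetical helper: Hamming distance, defined only for equal-length strings.
hamming_sketch <- function(x, y, mismatch = 1) {
  a <- strsplit(x, "")[[1]]
  b <- strsplit(y, "")[[1]]
  stopifnot(length(a) == length(b))
  sum(a != b) * mismatch          # each differing position costs `mismatch`
}

# Hypothetical helper: Levenshtein distance by dynamic programming.
# Rows correspond to symbols of x, columns to symbols of y (an assumption).
levenshtein_sketch <- function(x, y, mismatch = 1, gap = 1) {
  a <- strsplit(x, "")[[1]]
  b <- strsplit(y, "")[[1]]
  n <- length(a)
  m <- length(b)
  score <- matrix(0, n + 1, m + 1,
                  dimnames = list(c("gap", a), c("gap", b)))
  trace <- matrix(0L, n + 1, m + 1, dimnames = dimnames(score))
  score[, 1] <- (0:n) * gap       # x prefix aligned against gaps only
  score[1, ] <- (0:m) * gap       # y prefix aligned against gaps only
  trace[-1, 1] <- 1L              # border cells reachable only via gaps in y
  trace[1, -1] <- 3L              # border cells reachable only via gaps in x
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      cand <- c(score[i, j + 1] + gap,                    # 1: gap in y
                score[i, j] + mismatch * (a[i] != b[j]),  # 2: match/mismatch
                score[i + 1, j] + gap)                    # 3: gap in x
      trace[i + 1, j + 1] <- which.min(cand)  # ties resolve to the lowest code
      score[i + 1, j + 1] <- min(cand)
    }
  }
  list(distance = score[n + 1, m + 1],
       ScoringMatrix = score,
       TraceBackMatrix = trace)
}

res <- levenshtein_sketch("GACGGATTATG", "GATCGGAATAG")
res$distance
hamming_sketch("GACGGATTATG", "GATCGGAATAG")
```

With unit costs, the bottom-right cell of the scoring matrix is the edit distance. Raising mismatch above 2 * gap makes a substitution more expensive than a deletion followed by an insertion, so in this sketch the optimal path then avoids substitutions altogether.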

References

R. Merkl and S. Waack (2009) Bioinformatik Interaktiv. Wiley.

See Also

adist, dist

Examples

x <- "GACGGATTATG"
y <- "GATCGGAATAG"
## Levenshtein distance
d <- StrDist(x, y)
d
attr(d, "ScoringMatrix")
attr(d, "TraceBackMatrix")

## Hamming distance
StrDist(x, y, method = "hamming")