
Computes the Hamming or the Levenshtein (edit) distance between two strings.
stringDist(x, y, method = "levenshtein", mismatch = 1, gap = 1)
stringDist returns an object of S3 class "stringDist", inherited from class "dist"; cf. dist.
x: character vector, first string
y: character vector, second string
method: character, name of the distance method. This must be "levenshtein" or "hamming". Default is the classical Levenshtein distance.
mismatch: numeric, distance value for a mismatch between symbols
gap: numeric, distance value for inserting a gap
Matthias Kohl Matthias.Kohl@stamats.de
The function computes the Hamming and the Levenshtein (edit) distance of two given strings (sequences).
In case of the Hamming distance the two strings must have the same length.
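The equal-length requirement follows from how the Hamming distance is defined: it counts the positions at which the two strings differ. A minimal base R sketch of that idea is given here for illustration; hammingSketch is a hypothetical helper, not part of the package, and it uses a fixed cost of 1 per differing position.

```r
## Hypothetical helper (not part of the package): counts differing positions.
hammingSketch <- function(x, y) {
  a <- strsplit(x, "")[[1]]   # split into single characters
  b <- strsplit(y, "")[[1]]
  if (length(a) != length(b))
    stop("Hamming distance requires strings of equal length")
  sum(a != b)                 # number of positions with different symbols
}

hammingSketch("GACGGATTATG", "GATCGGAATAG")  # 7
```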
In case of the Levenshtein (edit) distance a scoring and a trace-back matrix are computed and are saved as attributes "ScoringMatrix" and "TraceBackMatrix".
The characters in the trace-back matrix reflect insertion of a gap in string y (d: deletion), match (m), mismatch (mm), and insertion of a gap in string x (i).
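How a scoring matrix and a trace-back matrix of this kind can be filled is sketched below with the usual dynamic programming recursion. This is an illustration in base R under stated assumptions, not the package's implementation: levenshteinSketch is a hypothetical name, the layout (rows for x, columns for y), the empty label in the top-left cell and the tie-breaking order (diagonal step first) are all assumed.

```r
## Hypothetical helper (not part of the package): edit distance by dynamic
## programming, returning the scoring and trace-back matrices as well.
levenshteinSketch <- function(x, y, mismatch = 1, gap = 1) {
  a <- strsplit(x, "")[[1]]
  b <- strsplit(y, "")[[1]]
  n <- length(a)
  m <- length(b)
  ## Assumed layout: rows index prefixes of x, columns index prefixes of y.
  score <- matrix(0, n + 1, m + 1, dimnames = list(c("gap", a), c("gap", b)))
  trace <- matrix("", n + 1, m + 1, dimnames = dimnames(score))
  score[, 1] <- (0:n) * gap   # x prefix aligned against gaps only
  score[1, ] <- (0:m) * gap   # y prefix aligned against gaps only
  trace[-1, 1] <- "d"
  trace[1, -1] <- "i"
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      isMatch <- a[i] == b[j]
      cand <- c(
        score[i, j] + if (isMatch) 0 else mismatch,  # match or mismatch
        score[i, j + 1] + gap,                       # gap in y: "d"
        score[i + 1, j] + gap                        # gap in x: "i"
      )
      labels <- c(if (isMatch) "m" else "mm", "d", "i")
      best <- which.min(cand)   # first minimum wins, so the diagonal is preferred
      score[i + 1, j + 1] <- cand[best]
      trace[i + 1, j + 1] <- labels[best]
    }
  }
  list(distance = score[n + 1, m + 1],
       ScoringMatrix = score,
       TraceBackMatrix = trace)
}

res <- levenshteinSketch("GACGGATTATG", "GATCGGAATAG")
res$distance
res$TraceBackMatrix
```

Following the labels from the bottom-right cell back to the top-left cell recovers one optimal alignment; with ties, other alignments of equal cost may exist.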
R. Merkl and S. Waack (2009). Bioinformatik Interaktiv. Wiley.
dist, stringSim
x <- "GACGGATTATG"
y <- "GATCGGAATAG"
## Levenshtein distance
d <- stringDist(x, y)
d
attr(d, "ScoringMatrix")
attr(d, "TraceBackMatrix")
## Hamming distance
stringDist(x, y, method = "hamming")