Implements an approximate string matching version of R's native 'match' function. Can calculate various string distances based on edits (Damerau-Levenshtein, Hamming, Levenshtein, optimal string alignment), q-grams (q-gram, cosine, Jaccard distance) or heuristic metrics (Jaro, Jaro-Winkler). An implementation of soundex is provided as well. Distances can be computed between character vectors while taking proper care of encoding, or between integer vectors representing generic sequences. This package is built for speed and runs in parallel by using 'OpenMP'. An API for C or C++ is exposed as well.
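
To make the idea of edit-based approximate matching concrete, here is a minimal, language-neutral sketch in Python. It is not the package's implementation and does not call it: the function names `levenshtein` and `approx_match` and the parameter `max_dist` are hypothetical, chosen only for illustration. The sketch uses 0-based indices and returns `None` when nothing is close enough, whereas R's 'match' is 1-based.

```python
from typing import Optional, Sequence


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Edit distance counting insertions, deletions and substitutions.

    Works on strings as well as on generic sequences such as lists of integers.
    """
    # prev[j] is the distance between the prefix of a handled so far and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x from a
                            curr[j - 1] + 1,      # insert y into a
                            prev[j - 1] + cost))  # substitute, or match
        prev = curr
    return prev[-1]


def approx_match(x: str, table: Sequence[str], max_dist: int = 1) -> Optional[int]:
    """Return the 0-based index of the closest entry of table within max_dist.

    Hypothetical helper: returns None if no entry is within max_dist.
    Ties are resolved in favour of the earliest entry.
    """
    best_index, best_dist = None, max_dist + 1
    for k, candidate in enumerate(table):
        d = levenshtein(x, candidate)
        if d < best_dist:
            best_index, best_dist = k, d
    return best_index
```

For example, `approx_match("leia", ["uhura", "leela"], max_dist=2)` selects the second entry, because "leia" can be turned into "leela" with one substitution and one insertion. With `max_dist=1` no entry qualifies and the result is `None`.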

seq_amatch Approximate matching for integer sequences
stringdist_api Calling stringdist from C or C++
seq_sim Compute similarity scores between sequences of integers
printable_ascii Detect the presence of non-printable or non-ASCII characters
phonetic Phonetic algorithms
seq_dist Compute distance metrics between integer sequences
stringdist-encoding Encoding handling in stringdist
seq_qgrams Get a table of q-gram counts for integer sequences
amatch Approximate string matching
qgrams Get a table of q-gram counts from one or more character vectors
stringdist-parallelization Multithreading and parallelization in stringdist
stringsim Compute similarity scores between strings
stringdist Compute distance metrics between strings
stringdist-package A package for string distance calculation and approximate string matching
stringdist-metrics String metrics in stringdist
