Note: a newer version (0.9.15) of this package is available.

stringdist (version 0.9.5.0)

Approximate String Matching and String Distance Functions

Description

Implements an approximate string matching version of R's native 'match' function. Can calculate various string distances based on edits (Damerau-Levenshtein, Hamming, Levenshtein, optimal string alignment), q-grams (q-gram, cosine, Jaccard distance) or heuristic metrics (Jaro, Jaro-Winkler). An implementation of soundex is provided as well. Distances can be computed between character vectors while taking proper care of encoding, or between integer vectors representing generic sequences. Stringdist is built for speed and parallelizes computation using 'OpenMP'. An API for C or C++ is exposed as well.
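
The following is a minimal sketch of the distance families named in the description, assuming the package is installed and attached. The input strings are arbitrary illustrations, not taken from the package documentation.

```r
library(stringdist)

# Edit-based distances
d_lv  <- stringdist("kitten", "sitting", method = "lv")       # Levenshtein: 3
d_osa <- stringdist("abc", "acb", method = "osa")             # optimal string alignment: 1
d_ham <- stringdist("karolin", "kathrin", method = "hamming") # Hamming: 3

# q-gram based distance (Jaccard on 2-grams)
d_jac <- stringdist("night", "nacht", method = "jaccard", q = 2)

# Heuristic metric: Jaro-Winkler (p is the prefix weight; p = 0 gives plain Jaro)
d_jw <- stringdist("martha", "marhta", method = "jw", p = 0.1)

# Approximate version of match(): index of the closest table entry
# within maxDist, or NA if none is close enough
idx <- amatch("leia", c("uhura", "leela"), maxDist = 2)
```

Here amatch() uses the default metric (optimal string alignment), so "leia" resolves to the second table entry.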

Install

install.packages('stringdist')

Monthly Downloads

64,244

Version

0.9.5.0

License

GPL-3

Maintainer

Mark van der Loo

Last Published

June 7th, 2018

Functions in stringdist (0.9.5.0)

printable_ascii
Detect the presence of non-printable or non-ascii characters

qgrams
Get a table of qgram counts from one or more character vectors.

seq_qgrams
Get a table of qgram counts for integer sequences

seq_sim
Compute similarity scores between sequences of integers

phonetic
Phonetic algorithms

amatch
Approximate string matching

stringdist-metrics
String metrics in stringdist

stringdist-package
A package for string distance calculation and approximate string matching.

stringdist_api
Calling stringdist from C or C++

stringdist-encoding
String metrics in stringdist

stringsim
Compute similarity scores between strings

seq_amatch
Approximate matching for integer sequences.

seq_dist
Compute distance metrics between integer sequences

stringdist-parallelization
Multithreading and parallelization in stringdist

stringdist
Compute distance metrics between strings
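
A few of the functions listed above can be exercised as follows. This is a short sketch with made-up inputs, assuming the package is installed; see each function's help page for the full argument list.

```r
library(stringdist)

# Table of 2-gram counts for a single string
qg <- qgrams("hello", q = 2)

# Soundex codes (the default phonetic method); both names map to "R163"
codes <- phonetic(c("Robert", "Rupert"))

# Similarity score in [0, 1] derived from the Levenshtein distance
s <- stringsim("kitten", "sitting", method = "lv")

# Distance between two integer sequences, treated as generic symbols
d_seq <- seq_dist(c(1L, 2L, 3L), c(1L, 3L, 2L), method = "lv")

# TRUE where a string consists of printable ASCII characters only
ok <- printable_ascii(c("abc", "caf\u00e9"))
```

The seq_* functions mirror their string counterparts but take integer vectors (or lists of integer vectors) as input, which is useful when the symbols being compared are not characters.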