Interface to phonetic coding functions.
pho_h(str)
soundex(str)
str: A character vector or matrix. Factors are converted to character.
Translates its argument to a phonetic code. pho_h by Jörg Michael (see references) is intended for the German language and normalizes umlauts and accented characters. soundex is a widespread algorithm for English names. This implementation can only handle common characters. Both algorithms strip off non-alphabetical characters, with the exception that numbers are left unchanged by pho_h.
The C code for soundex was taken from PostgreSQL 8.3.6.
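The stripping and coding behaviour described above can be illustrated with a small base R sketch. It is a simplified Soundex variant written for illustration only. It is not the package's C code and not a port of the PostgreSQL source, so results may differ from soundex for some inputs. The names soundex_sketch and encode_one are hypothetical. The letter-to-digit table and the fixed code length of four follow the usual Soundex conventions and are assumptions here, not something stated on this page.

```r
# Simplified Soundex sketch in base R (hypothetical helper, NOT the package's C code).
soundex_sketch <- function(str) {
  # Digit assigned to each letter A..Z; "0" marks letters that are not coded
  # (assumed table, following the usual Soundex convention).
  codes <- strsplit("01230120022455012623010202", "")[[1]]
  names(codes) <- LETTERS

  encode_one <- function(s) {
    if (is.na(s)) return(NA_character_)
    # Strip non-alphabetical characters and fold to upper case.
    chars <- strsplit(toupper(gsub("[^A-Za-z]", "", s)), "")[[1]]
    if (length(chars) == 0) return("")
    digits <- codes[chars]
    out <- chars[1]  # the first letter is kept as is
    # Walk the remaining letters, skipping uncoded letters and letters whose
    # code repeats the code of the directly preceding letter.
    for (i in seq_along(chars)[-1]) {
      if (digits[i] != "0" && digits[i] != digits[i - 1]) {
        out <- c(out, digits[i])
      }
    }
    # Pad with zeros and cut to a length of four.
    substr(paste0(paste(out, collapse = ""), "000"), 1, 4)
  }

  # Factors are converted to character; dimensions of the input are kept.
  x <- if (is.factor(str)) as.character(str) else str
  res <- vapply(as.character(x), encode_one, character(1), USE.NAMES = FALSE)
  dim(res) <- dim(str)
  res
}

soundex_sketch(c("Robert", "Rupert"))  # both names map to "R163"
```

Because the result keeps the dimensions of the input, a character matrix of names yields a matrix of codes of the same shape, which matches the return value described for pho_h and soundex.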
A character vector or matrix with the same size and dimensions as str, containing its phonetic encoding.
Jörg Michael, Doppelgänger gesucht -- Ein Programm für [...], in: c't 1999, No. 25, pp. 252--261. The source code is published (under GPL) at http://www.heise.de/ct/ftp/99/25/252/.
Andreas Borg (R interface only)
jarowinkler and levenshteinSim for string comparison.