Preprocessing (cleaning) of strings prior to linkage.
StandardizeString(strings)
Returns a character vector with standardized strings.
strings: A character vector of strings to be standardized.
Strings are converted to upper case, and letters are substituted as listed below. Leading and trailing blanks are removed. Any remaining non-ASCII characters are deleted.
Replace "Æ" with "AE"
Replace "æ" with "AE"
Replace "Ä" with "AE"
Replace "ä" with "AE"
Replace "Å" with "A"
Replace "å" with "A"
Replace "Â" with "A"
Replace "â" with "A"
Replace "À" with "A"
Replace "à" with "A"
Replace "Á" with "A"
Replace "á" with "A"
Replace "Ç" with "C"
Replace "ç" with "C"
Replace "Ê" with "E"
Replace "ê" with "E"
Replace "È" with "E"
Replace "è" with "E"
Replace "É" with "E"
Replace "é" with "E"
Replace "Ï" with "I"
Replace "ï" with "I"
Replace "Î" with "I"
Replace "î" with "I"
Replace "Ì" with "I"
Replace "ì" with "I"
Replace "Í" with "I"
Replace "í" with "I"
Replace "Ö" with "OE"
Replace "ö" with "OE"
Replace "Ø" with "O"
Replace "ø" with "O"
Replace "Ô" with "O"
Replace "ô" with "O"
Replace "Ò" with "O"
Replace "ò" with "O"
Replace "Ó" with "O"
Replace "ó" with "O"
Replace "ß" with "SS"
Replace "Ş" with "S"
Replace "ş" with "S"
Replace "ü" with "UE"
Replace "Ü" with "UE"
Replace "Ů" with "U"
Replace "Û" with "U"
Replace "û" with "U"
Replace "Ù" with "U"
Replace "ù" with "U"
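The rules above can be approximated in base R. The following is only a sketch for illustration, not the package's own implementation: the function name standardize_sketch and the rules list are hypothetical, lower-case "ç" is assumed to map to "C" like its upper-case form, and the order of the steps (substitute, delete, capitalize, trim) is an assumption as well. Characters are written as \u escapes so that the code does not depend on the encoding of the source file.

```r
# Hypothetical base-R sketch of the standardization rules listed above.
# This is NOT the source code of StandardizeString().
standardize_sketch <- function(strings) {
  # Each entry maps a replacement to the characters it replaces.
  rules <- list(
    AE = "\u00c6\u00e6\u00c4\u00e4",                          # Æ æ Ä ä
    A  = "\u00c5\u00e5\u00c2\u00e2\u00c0\u00e0\u00c1\u00e1",  # Å å Â â À à Á á
    C  = "\u00c7\u00e7",                                      # Ç ç (ç assumed)
    E  = "\u00ca\u00ea\u00c8\u00e8\u00c9\u00e9",              # Ê ê È è É é
    I  = "\u00cf\u00ef\u00ce\u00ee\u00cc\u00ec\u00cd\u00ed",  # Ï ï Î î Ì ì Í í
    OE = "\u00d6\u00f6",                                      # Ö ö
    O  = "\u00d8\u00f8\u00d4\u00f4\u00d2\u00f2\u00d3\u00f3",  # Ø ø Ô ô Ò ò Ó ó
    SS = "\u00df",                                            # ß
    S  = "\u015e\u015f",                                      # Ş ş
    UE = "\u00fc\u00dc",                                      # ü Ü
    U  = "\u016e\u00db\u00fb\u00d9\u00f9"                     # Ů Û û Ù ù
  )

  x <- enc2utf8(strings)

  # Substitute every listed character with its replacement.
  for (target in names(rules)) {
    x <- gsub(paste0("[", rules[[target]], "]"), target, x)
  }

  # Delete whatever non-ASCII characters are left.
  x <- iconv(x, from = "UTF-8", to = "ASCII", sub = "")

  # Capitalize and strip leading and trailing blanks.
  trimws(toupper(x))
}

# "Päter", " Jürgen", " Roß"
standardize_sketch(c("P\u00e4ter", " J\u00fcrgen", " Ro\u00df"))
# under the rules above: "PAETER" "JUERGEN" "ROSS"
```

Doing the substitutions before toupper() keeps the sketch independent of how the current locale capitalizes non-ASCII letters, since only ASCII text is left by the time the strings are capitalized.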
strings = c("Päter", " Jürgen", " Roß")
StandardizeString(strings)