tau (version 0.0-21)

translate: Translate Unicode Latin Ligatures

Description

Translate Unicode “Latin ligature” characters to their respective constituents.

Usage

translate_Unicode_latin_ligatures(x)

Arguments

x

a character vector in UTF-8 encoding.

Details

In typography, a ligature occurs where two or more graphemes are joined as a single glyph. (See http://en.wikipedia.org/wiki/Typographic_ligature for more information.)

Unicode (http://www.unicode.org/) lists the following “Latin” ligatures:

Code  Name
0132  LATIN CAPITAL LIGATURE IJ
0133  LATIN SMALL LIGATURE IJ
0152  LATIN CAPITAL LIGATURE OE
0153  LATIN SMALL LIGATURE OE
FB00  LATIN SMALL LIGATURE FF
FB01  LATIN SMALL LIGATURE FI
FB02  LATIN SMALL LIGATURE FL
FB03  LATIN SMALL LIGATURE FFI
FB04  LATIN SMALL LIGATURE FFL
FB05  LATIN SMALL LIGATURE LONG S T

translate_Unicode_latin_ligatures replaces each of these ligatures in x with its constituent characters; for example, U+FB01 (LATIN SMALL LIGATURE FI) becomes the two characters “f” and “i”.
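
Examples

The translation described above can be illustrated with a minimal base R sketch. This is not the implementation used in tau: the helper expand_ligatures and the two lookup vectors are hypothetical names introduced here for illustration, and the replacement chosen for FB05 (long s followed by t, its Unicode compatibility decomposition) is an assumption that may differ from what the package returns.

```r
# Illustrative sketch only; not the code shipped in tau.
# Code points are taken from the table in Details.
ligatures <- c("\u0132", "\u0133", "\u0152", "\u0153",
               "\ufb00", "\ufb01", "\ufb02", "\ufb03",
               "\ufb04", "\ufb05")

# Constituent characters, in the same order as 'ligatures'.
# Assumption: FB05 expands to LATIN SMALL LETTER LONG S (U+017F) plus "t".
constituents <- c("IJ", "ij", "OE", "oe",
                  "ff", "fi", "fl", "ffi",
                  "ffl", "\u017ft")

# Hypothetical helper: replace every listed ligature in each element of x.
expand_ligatures <- function(x) {
  x <- enc2utf8(x)  # the documented input is a UTF-8 character vector
  for (i in seq_along(ligatures)) {
    # fixed = TRUE treats the ligature as a literal string, not a regex
    x <- gsub(ligatures[i], constituents[i], x, fixed = TRUE)
  }
  x
}

x <- c("\ufb01nal", "e\ufb03cient", "c\u0153ur")
expand_ligatures(x)
# [1] "final"     "efficient" "coeur"
```

With tau installed, the documented call is translate_Unicode_latin_ligatures(x) on the same input. The loop is applied over the whole vector, so a string that contains several different ligatures has all of them replaced.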