Convert Integer Vectors to or from UTF-8-encoded Character Vectors
Conversion of UTF-8 encoded character vectors to and from integer vectors.
utf8ToInt(x)
intToUtf8(x, multiple = FALSE)
- x: object to be converted.
- multiple: logical: should the conversion be to a single character string or multiple individual characters?
These will work in any locale, including on platforms that do not otherwise support multi-byte character sets.
Unicode defines a name and a number for each of the glyphs it encompasses. The numbers are called code points; they run from 0 to 0x10FFFF.
utf8ToInt converts a length-one character string encoded in UTF-8 to an integer vector of Unicode code points. As of R 3.2.1, it checks the validity of the input and returns NA if it is invalid.
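The behaviour described above can be sketched as follows. This is a minimal illustration, assuming R 3.2.1 or later; the variable names cp and bad are chosen here for the example only.

```r
# "\u00df" is LATIN SMALL LETTER SHARP S, code point 223 (0xDF)
cp <- utf8ToInt("bi\u00dfchen")
cp            # one integer per character, not per byte
length(cp)    # 7 characters, although the UTF-8 string occupies 8 bytes

# A byte sequence that is not valid UTF-8 yields NA
bad <- utf8ToInt("\xfa\xb4\xbf\xbf\x9f")
is.na(bad)
```

Because the result holds code points rather than bytes, its length matches the number of characters in the string.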
intToUtf8 converts a numeric vector of Unicode code points either to a single character string or to a character vector of single characters. (For a single character string, 0 is silently omitted; otherwise 0 is mapped to "". Non-integral numeric values are truncated to integers.) The Encoding is declared as "UTF-8".
NA inputs are mapped to NA output.
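A short sketch of these rules for intToUtf8 follows. The input values and variable names are illustrative assumptions, not part of the original page.

```r
cps <- c(72, 105, 0, 33)   # "H", "i", a zero, "!"

# Default: a single string; the 0 is silently dropped
one <- intToUtf8(cps)

# multiple = TRUE: one element per code point; the 0 becomes ""
many <- intToUtf8(cps, multiple = TRUE)

# Non-integral values are truncated, so 945.9 is treated as 945 (Greek alpha)
alpha <- intToUtf8(945.9)
Encoding(alpha)            # declared encoding of the non-ASCII result

# NA in, NA out
na_out <- intToUtf8(NA_integer_)
```

Use multiple = TRUE when positions must be preserved, since the single-string form loses any zero entries.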
## will only display in some locales and fonts
intToUtf8(0x03B2L) # Greek beta
utf8ToInt("bi\u00dfchen")
utf8ToInt("\xfa\xb4\xbf\xbf\x9f")