UTF-8 Text Processing
Maintainer: Kirill Müller <kirill@cynkra.com>
Authors:
Patrick O. Perry [copyright holder]
Other contributors:
Unicode, Inc. (Unicode Character Database) [copyright holder, data contributor]
Functions for manipulating and printing UTF-8 encoded text:
as_utf8() attempts to convert character data to UTF-8, throwing an error if the data is invalid;
utf8_valid() tests whether character data is valid according to its declared encoding;
utf8_normalize() converts text to Unicode composed normal form (NFC), optionally applying case-folding and compatibility maps;
utf8_encode() encodes a character string, escaping all control characters, so that it can be safely printed to the screen;
utf8_format() formats a character vector by truncating to a specified character width limit or by left, right, or center justifying;
utf8_print() prints UTF-8 character data to the screen;
utf8_width() measures the display width of UTF-8 character strings (many emoji and East Asian characters are twice as wide as other characters);
output_ansi() and output_utf8() test for the output connection's capabilities.
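The functions listed above can be tried out with a short session. The following is a minimal sketch rather than an excerpt from the package's own examples: the variable names and sample strings are illustrative, and the call to utf8_width() passes utf8 = TRUE on the assumption that the output supports UTF-8, so that widths are measured for the characters themselves and not for their escaped forms.

```r
library(utf8)

# Two spellings of "é": precomposed (U+00E9) and decomposed ("e" + U+0301).
composed   <- "\u00e9"
decomposed <- "e\u0301"

# utf8_normalize() converts to NFC, so both spellings become the same string.
same_after_nfc <- utf8_normalize(decomposed) == utf8_normalize(composed)

# A Latin-1 byte (0xE7) wrongly declared as UTF-8 is not valid UTF-8.
bad <- "fa\xE7ile"
Encoding(bad) <- "UTF-8"
valid <- utf8_valid(c("facile", bad))

# as_utf8() throws an error on the invalid input; capture it instead of stopping.
result <- tryCatch(as_utf8(bad), error = function(e) "conversion failed")

# Display widths, assuming UTF-8 capable output (utf8 = TRUE):
# an ASCII letter and an East Asian wide character.
widths <- utf8_width(c("a", "\u4e2d"), utf8 = TRUE)
```

Wrapping as_utf8() in tryCatch() is one way to handle input of uncertain origin; checking with utf8_valid() first is an alternative when the invalid elements need to be located rather than rejected.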
For a complete list of functions, use library(help = "utf8").
Useful links: