Keywords: R, text processing, character strings, internationalization, localization, ICU, ICU4C, i18n, l10n, Unicode.
Homepage:
License: The BSD-3-clause license for the package code, the ICU license for the accompanying ICU4C distribution, and the UCD license for the Unicode Character Database. See the COPYRIGHTS and LICENSE files for more details.
stri_datetime_format for date/time formatting and parsing. Also refer to the links therein for other date/time/time zone-related operations.
stri_stats_general and stri_stats_latex for gathering some fancy statistics on a character vector's contents.
stri_join, stri_dup, %s+%, and stri_flatten for concatenation-based operations.
stri_sub for extracting and replacing substrings, and stri_reverse for a joyful function to reverse all code points in a string.
stri_length (among others) for determining the number of code points in a string. See also stri_count_boundaries for counting the number of Unicode characters and stri_width for approximating the width of a string.
stri_trim (among others) for trimming characters from the beginning and/or end of a string, see also stringi-search-charclass, and stri_pad for padding strings so that they are of the same width. Additionally, stri_wrap wraps text into lines.
stri_trans_tolower (among others) for case mapping, i.e., conversion to lower, UPPER, or Title Case, stri_trans_nfc (among others) for Unicode normalization, stri_trans_char for translating individual code points, and stri_trans_general for other very general yet powerful text transforms, including transliteration.
stri_cmp, %s<%, stri_order, stri_sort, stri_unique, and stri_duplicated for collation-based, locale-aware operations, see also stringi-locale.
stri_split_lines (among others) to split a string into text lines.
stri_escape_unicode (among others) for escaping certain code points.
stri_rand_strings, stri_rand_shuffle, and stri_rand_lipsum for generating (pseudo)random strings.
stri_read_raw, stri_read_lines, and stri_write_lines for reading and writing text files.
Note that each man page provides many further links to other interesting facilities and topics.
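As a brief illustration of a few of the facilities listed above, the following is a minimal sketch, assuming the stringi package is installed. The input vector `x` is made up for this example; the man pages of the individual functions remain the authoritative reference for arguments and edge cases.

```r
library(stringi)

# Hypothetical input, chosen only for illustration.
x <- c("  spam ", "bacon", "eggs")

# Trim white space from both ends, then count code points.
y <- stri_trim_both(x)
stri_length(y)                      # 4 5 4

# Concatenation: element-wise with %s+%, collapsing with stri_flatten.
y %s+% "!"                          # "spam!" "bacon!" "eggs!"
stri_flatten(y, collapse = ", ")    # "spam, bacon, eggs"
stri_dup("ab", 3)                   # "ababab"

# Extract a substring by code point positions and reverse a string.
stri_sub("stringi", 1, 6)           # "string"
stri_reverse("abc")                 # "cba"

# Case mapping, and padding so that all strings have the same width.
stri_trans_toupper("bacon")         # "BACON"
stri_pad_left(y, 6, pad = "*")      # "**spam" "*bacon" "**eggs"
```

All of these functions are vectorized over their main argument, which is why `y %s+% "!"` and `stri_pad_left(y, ...)` return one result per input string.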
stri_opts_collator for a description of the string collation algorithm, which is used for string comparing, ordering, sorting, case-folding, and searching.
ICU -- International Components for Unicode,
ICU4C API Documentation,
The Unicode Consortium,
UTF-8, a transformation format of ISO 10646 -- RFC 3629,
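The collation-based operations mentioned earlier (stri_cmp, %s<%, stri_sort, stri_unique, and the settings described under stri_opts_collator) can be sketched as follows. This is a rough example, assuming the stringi package is installed and that the `en_US` locale is available in the bundled ICU; the input strings are made up, and exact orderings can differ between locales and ICU versions.

```r
library(stringi)

# Hypothetical input, chosen only for illustration.
x <- c("b", "A", "a", "B")

# Locale-aware sort; collator settings such as `locale` can be passed
# directly and are forwarded to stri_opts_collator().
sorted <- stri_sort(x, locale = "en_US")

# Comparison returns -1, 0, or 1.
stri_cmp("a", "b", locale = "en_US")    # -1
"a" %s<% "b"                            # TRUE

# A collator strength of 2 ignores case differences.
ci <- stri_opts_collator(strength = 2, locale = "en_US")
stri_cmp("hello", "HELLO", opts_collator = ci)   # 0

# With the same collator, "a" and "A" count as duplicates.
uniq <- stri_unique(c("a", "A", "b"), opts_collator = ci)
```

Building the options once with stri_opts_collator and reusing them keeps the comparison rules consistent across several calls.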
stringi-arguments;
stringi-encoding;
stringi-locale;
stringi-search-boundaries;
stringi-search-charclass;
stringi-search-coll;
stringi-search-fixed;
stringi-search-regex;
stringi-search