Keywords: R, text processing, character strings, internationalization, localization, ICU, ICU4C, i18n, l10n, Unicode.
Homepage: http://www.gagolewski.com/software/stringi/
License: The BSD-3-clause license for the package code, the ICU license for the accompanying ICU4C distribution, and the UCD license for the Unicode Character Database. See the COPYRIGHTS and LICENSE files for more details.
stri_datetime_format for date/time formatting and parsing. Also refer to the links therein for other date/time/time zone-related operations.
stri_stats_general and stri_stats_latex for gathering some fancy statistics on a character vector's contents.
stri_join, stri_dup, %s+%, and stri_flatten for concatenation-based operations.
stri_sub for extracting and replacing substrings, and stri_reverse for a joyful function to reverse all code points in a string.
stri_length (among others) for determining the number of code points in a string. See also stri_count_boundaries for counting the number of Unicode characters and stri_width for approximating the width of a string.
stri_trim (among others) for trimming characters from the beginning and/or end of a string, see also stringi-search-charclass, and stri_pad for padding strings so that they are of the same width.
Additionally, stri_wrap wraps text into lines.
stri_trans_tolower (among others) for case mapping, i.e., conversion to lower, UPPER, or Title Case, stri_trans_nfc (among others) for Unicode normalization, stri_trans_char for translating individual code points, and stri_trans_general for other very general yet powerful text transforms, including transliteration.
stri_cmp, %s<%, stri_order, stri_sort, stri_unique, and stri_duplicated for collation-based, locale-aware operations, see also stringi-locale.
stri_split_lines (among others) to split a string into text lines.
stri_escape_unicode (among others) for escaping certain code points.
stri_rand_strings, stri_rand_shuffle, and stri_rand_lipsum for generating (pseudo)random strings.
stri_read_raw, stri_read_lines, and stri_write_lines for reading and writing text files.
stri_opts_collator for a description of the string collation algorithm, which is used for string comparing, ordering, sorting, case-folding, and searching.
ICU -- International Components for Unicode, http://www.icu-project.org/
ICU4C API Documentation, http://www.icu-project.org/apiref/icu4c/
The Unicode Consortium, http://www.unicode.org/
UTF-8, a transformation format of ISO 10646 -- RFC 3629, http://tools.ietf.org/html/rfc3629
stringi-arguments, stringi-encoding, stringi-locale, stringi-search-boundaries, stringi-search-charclass, stringi-search-coll, stringi-search-fixed, stringi-search-regex, stringi-search
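
The function families listed on this page can be tried out with a few short calls. The following is a minimal sketch, assuming the stringi package is installed; the input strings and variable names are made up for illustration and are not taken from the package documentation.

```r
library(stringi)

x <- c("stringi", "ICU")

# Concatenation-based operations
joined <- stri_join(x, "!")                 # element-wise, sep = "" by default
flat   <- stri_flatten(x, collapse = ", ")  # collapse a vector into one string
dup    <- stri_dup("ab", 3)                 # repeat a string
plus   <- "a" %s+% "b"                      # operator form of concatenation

# Code point counts, substrings, and reversal
len <- stri_length(x)
sub <- stri_sub("stringi", 1, 3)
rev <- stri_reverse("abc")

# Case mapping
up  <- stri_trans_toupper("stringi")
low <- stri_trans_tolower("ICU")

# Trimming and padding
trimmed <- stri_trim_both("  text  ")
padded  <- stri_pad_left("7", 3, pad = "0")

# Unicode normalization: "a" followed by a combining acute accent (U+0301)
# is two code points; NFC composes the pair into one code point.
decomposed <- stri_unescape_unicode("a\\u0301")
composed   <- stri_trans_nfc(decomposed)
n_before   <- stri_length(decomposed)
n_after    <- stri_length(composed)

print(list(joined = joined, flat = flat, dup = dup, plus = plus,
           len = len, sub = sub, rev = rev, up = up, low = low,
           trimmed = trimmed, padded = padded,
           n_before = n_before, n_after = n_after))
```

The combining-accent string is built with stri_unescape_unicode rather than a literal so that the example does not depend on the encoding of the source file or the session locale.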