stri_encode(str, from = NULL, to = NULL, to_raw = FALSE)
stri_conv(str, from = NULL, to = NULL, to_raw = FALSE)
str: raw vectors to be converted
from: NULL or "" for default encoding or for using internal
encoding marks (see Details); otherwise, a single string
with an encoding name, see stri_enc_list
to: NULL or "" for default encoding (see stri_enc_get), or a
single string with an encoding name
If to_raw is FALSE, then a character vector
with encoded strings (and sensible encoding marks) is
returned. Otherwise, you get a list of raw vectors.
stri_conv is an alias for stri_encode.
Please refer to stri_enc_list for the list
of supported encodings and to stringi-encoding for a
general discussion.
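As a minimal sketch of the two return types, assuming the stringi package is installed and that ISO-8859-1 is among the names reported by stri_enc_list (the sample string is made up for illustration):

```r
library(stringi)

x <- "caf\u00e9"  # sample input; the \u escape yields an encoding-marked string

# to_raw = FALSE (the default): a character vector is returned
y <- stri_encode(x, to = "ISO-8859-1")

# to_raw = TRUE: a list of raw vectors, one element per input string
z <- stri_encode(x, to = "ISO-8859-1", to_raw = TRUE)
print(z[[1]])

# stri_conv is an alias, so the call below is interchangeable
z2 <- stri_conv(x, to = "ISO-8859-1", to_raw = TRUE)

# look the encoding name up in the list of supported encodings
print("ISO-8859-1" %in% unlist(stri_enc_list()))
```

The raw form is convenient when the bytes are going to be written to a file or a connection rather than printed.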
If from is either missing, "", or NULL
and str is an atomic vector, then the input strings'
encoding marks are used (just like in almost all
other stringi functions; see stri_enc_get). Otherwise, the internal
encoding marks are overridden by the given encoding. On the
other hand, if str is a list of raw vectors and from is
not given, we assume that the input encoding is the
current default encoding.
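The three input cases can be sketched as follows. This is an illustration only: the byte values and the choice of ISO-8859-1 are assumptions made for the example, not part of the original text.

```r
library(stringi)

# Case 1: character input, `from` omitted: the strings' own marks are used
x <- "caf\u00e9"
utf8_bytes <- stri_encode(x, to = "UTF-8", to_raw = TRUE)[[1]]

# Case 2: a list of raw vectors with an explicit `from`
latin1_bytes <- list(
  as.raw(c(0x63, 0x61, 0x66, 0xe9)),  # "caf" + e-acute in ISO-8859-1
  as.raw(c(0xfc, 0x62, 0x65, 0x72))   # u-umlaut + "ber" in ISO-8859-1
)
decoded <- stri_encode(latin1_bytes, from = "ISO-8859-1", to = "UTF-8")

# Case 3: a list of raw vectors, `from` omitted: the current default
# encoding is assumed. ASCII-only bytes are used here so that the result
# is the same under any ASCII-compatible default.
ascii_only <- stri_encode(list(charToRaw("hi")), to = "UTF-8")
```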
For to_raw=FALSE, the output strings always have
their encodings marked according to the target converter
used (as specified by to) and the current default encoding
(ASCII, latin1, UTF-8, native,
or bytes in all other cases).
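To see which mark a given target produces, the result can be inspected with base R's Encoding(). A small sketch, with the sample string and target names chosen only for illustration:

```r
library(stringi)

x <- "caf\u00e9"

# Non-ASCII output converted to UTF-8
u <- stri_encode(x, to = "UTF-8")
print(Encoding(u))

# The same text converted to an 8-bit target
l <- stri_encode(x, to = "ISO-8859-1")
print(Encoding(l))
```

Note that Encoding() reports "unknown" for pure-ASCII strings, so a non-ASCII sample is needed to observe the mark.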
Note that problems may occur when to is set
to, e.g., UTF-16 or UTF-32, as the output strings may have
embedded NULs. In such cases, use to_raw=TRUE and
consider specifying a byte order mark (BOM) for
portability reasons (e.g., set UTF-16 or
UTF-32, which automatically add BOMs).
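A short sketch of the raw-output route for UTF-16. The names UTF-16BE and UTF-16LE are assumed here as the fixed-byte-order variants known to ICU; they do not appear in the text above.

```r
library(stringi)

x <- "a"

# Fixed byte order, no BOM; note the 0x00 byte, which is why a
# character-vector result would be problematic
be <- stri_encode(x, to = "UTF-16BE", to_raw = TRUE)[[1]]
le <- stri_encode(x, to = "UTF-16LE", to_raw = TRUE)[[1]]
print(be)
print(le)

# Plain "UTF-16": a BOM is written first, followed by the text
with_bom <- stri_encode(x, to = "UTF-16", to_raw = TRUE)[[1]]
print(with_bom)

# Decoding the BOM-prefixed bytes recovers the original text
back <- stri_encode(list(with_bom), from = "UTF-16", to = "UTF-8")
```

Because the BOM records the byte order, the last call does not need to know which order the producing machine used.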
Note that stri_encode(as.raw(data),
"8bitencodingname") is a wise substitute for
rawToChar.
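The difference from rawToChar can be sketched like this. The bytes and the ISO-8859-1 name are illustrative assumptions, and to is given explicitly (rather than left at the default encoding) only to make the result independent of the session's settings.

```r
library(stringi)

bytes <- as.raw(c(0x63, 0x61, 0x66, 0xe9))  # "caf" + e-acute in ISO-8859-1

# rawToChar() copies the bytes verbatim and records no encoding
r1 <- rawToChar(bytes)
print(Encoding(r1))

# stri_encode() decodes the bytes from the stated 8-bit encoding
r2 <- stri_encode(bytes, from = "ISO-8859-1", to = "UTF-8")
print(nchar(r2))
```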
Currently, if an incorrect code point is found on input, it is replaced by the default (for that target encoding) substitute character and a warning is generated.
Converters -- ICU User Guide,
stri_enc_fromutf32;
stri_enc_toascii;
stri_enc_toutf32;
stri_enc_toutf8;
stringi-encoding