stri_encode(str, from = NULL, to = NULL, to_raw = FALSE)

stri_conv(str, from = NULL, to = NULL, to_raw = FALSE)

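str: a character vector, a raw vector, or a list of
raw vectors to be converted

from: input encoding:
NULL or "" for the default encoding
or for using the internal encoding marks (see Details);
otherwise, a single string with an encoding name,
see stri_enc_list

to: target encoding:
NULL or "" for the default encoding
(see stri_enc_get),
or a single string with an encoding name

to_raw: a single logical value; whether a list of raw vectors
should be returned instead of a character vector

If to_raw is FALSE,
then a character vector with encoded strings (and sensible
encoding marks) is returned.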
Otherwise, a list of raw vectors is produced.

stri_conv is an alias for stri_encode.

These two functions aim to replace R's iconv.
They are not only faster, but also
work in the same manner on all platforms.
Please refer to stri_enc_list for the list
of supported encodings and stringi-encoding
for a general discussion.
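
A minimal sketch of a round trip between two encodings, assuming the stringi package is installed. The sample string and the choice of ISO-8859-1 are illustrative only and do not come from the original text.

```r
library(stringi)

# Illustrative input: "cafe" with an acute accent, written with a
# Unicode escape so that the source itself stays ASCII-only.
x <- "caf\u00e9"

# Convert from UTF-8 to ISO-8859-1; 'from' is given explicitly here.
y <- stri_encode(x, from = "UTF-8", to = "ISO-8859-1")
charToRaw(y)   # inspect the bytes of the converted string

# stri_conv is an alias, so it can be used for the way back.
z <- stri_conv(y, from = "ISO-8859-1", to = "UTF-8")
identical(charToRaw(z), charToRaw(x))
```

Comparing the raw bytes rather than the printed strings keeps the check independent of the console's locale.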
If str is a character vector
and from is either missing, "", or NULL,
then the declared encodings are used
(see stri_enc_mark) -- in such a case bytes-declared
strings are disallowed.
Otherwise, the internal encoding declarations are ignored and
a converter selected with from is used.
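
The two cases above can be sketched as follows. The latin1-declared string is built by hand purely for illustration, and the last call only captures whatever condition is raised for a bytes-declared input rather than assuming a particular message.

```r
library(stringi)

# Build a latin1-declared string by hand ("cafe" with an acute accent
# in ISO-8859-1); these bytes are an illustrative assumption.
lat <- rawToChar(as.raw(c(0x63, 0x61, 0x66, 0xe9)))
Encoding(lat) <- "latin1"

# 'from' omitted: the declared mark (latin1) selects the input encoding.
a <- stri_encode(lat, to = "UTF-8")

# 'from' given: the mark is ignored and the named converter is used.
b <- stri_encode(lat, from = "ISO-8859-1", to = "UTF-8")

identical(charToRaw(a), charToRaw(b))

# A bytes-declared string with 'from' omitted should be rejected;
# the error message, if any, is captured so the script keeps running.
raw_str <- lat
Encoding(raw_str) <- "bytes"
res <- tryCatch(stri_encode(raw_str, to = "UTF-8"),
                error = function(e) conditionMessage(e))
res
```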
On the other hand, for str being a raw vector
or a list of raw vectors,
we assume that the input encoding is the current default encoding
as given by stri_enc_get.
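
A short sketch of raw input, with hypothetical sample bytes. Here from is mostly given explicitly so that the result does not depend on the session's default encoding; the final call omits it and therefore relies on the default being ASCII-compatible, which is an assumption.

```r
library(stringi)

# Hypothetical sample data: two words as ISO-8859-1 bytes.
cafe  <- as.raw(c(0x63, 0x61, 0x66, 0xe9))         # "cafe" + acute accent
naive <- as.raw(c(0x6e, 0x61, 0xef, 0x76, 0x65))   # "naive" + diaeresis

# A single raw vector yields a character vector of length one.
one <- stri_encode(cafe, from = "ISO-8859-1", to = "UTF-8")

# A list of raw vectors yields one string per list element.
many <- stri_encode(list(cafe, naive), from = "ISO-8859-1", to = "UTF-8")
length(many)

# With 'from' omitted, the bytes are read in the default encoding,
# whose name stri_enc_get() reports. The input here is ASCII-only.
stri_enc_get()
stri_encode(as.raw(c(0x6f, 0x6b)), to = "UTF-8")
```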
For to_raw=FALSE, the output
strings always have their encodings marked according to the target converter
used (as specified by to) and the current default Encoding
(ASCII, latin1, UTF-8, native,
or bytes in all other cases).
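
One way to observe the marks is to convert the same string to several targets and query base R's Encoding() on each result. The targets below are illustrative choices. Note that Encoding() reports both ASCII and native strings as "unknown", so the printed values depend on the session's default encoding.

```r
library(stringi)

x <- "caf\u00e9"   # illustrative non-ASCII input

# Illustrative target encodings; all three can represent the input.
targets <- c("UTF-8", "ISO-8859-1", "ISO-8859-2")

# Convert to each target and record the mark that R reports.
marks <- vapply(targets, function(enc) {
  Encoding(stri_encode(x, from = "UTF-8", to = enc))
}, character(1))

marks
```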
Note that problems may occur if to indicates, e.g., UTF-16 or UTF-32,
as the output strings may have embedded NULs.
In such cases, use to_raw=TRUE and consider
specifying a byte order mark (BOM) for portability reasons
(e.g., set UTF-16 or UTF-32, which automatically
add BOMs).
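
A minimal sketch of producing UTF-16 output as raw bytes, with an illustrative two-character input. The byte order used by the plain UTF-16 target is not stated in the text, so the example only compares lengths rather than assuming a particular BOM.

```r
library(stringi)

x <- "A\u00e9"   # illustrative input: "A" followed by e-acute

# UTF-16LE: the byte order is explicit in the encoding name.
le <- stri_encode(x, from = "UTF-8", to = "UTF-16LE", to_raw = TRUE)[[1]]
le

# UTF-16: per the text above, a BOM is added automatically,
# so the result should be two bytes longer.
bom <- stri_encode(x, from = "UTF-8", to = "UTF-16", to_raw = TRUE)[[1]]
bom
length(bom) - length(le)

# Raw vectors can be fed back in, naming the input encoding explicitly.
back <- stri_encode(bom, from = "UTF-16", to = "UTF-8")
identical(charToRaw(back), charToRaw(x))
```

Keeping the result as a raw vector avoids the embedded NULs that a character string cannot hold.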
Note that stri_encode(as.raw(data), "encodingname")
is a wise substitute for rawToChar.
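
The difference from rawToChar can be sketched with hypothetical ISO-8859-2 bytes. The to argument is set to UTF-8 explicitly here, which is a choice made for a predictable result and not part of the call shown above.

```r
library(stringi)

# Hypothetical bytes: a six-letter Polish word in ISO-8859-2.
data <- as.raw(c(0x7a, 0x61, 0xbf, 0xf3, 0xb3, 0xe6))

# rawToChar copies the bytes as they are and leaves them undeclared.
plain <- rawToChar(data)
Encoding(plain)

# stri_encode names the input encoding and converts in one step.
s <- stri_encode(data, from = "ISO-8859-2", to = "UTF-8")
Encoding(s)
charToRaw(s)
```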
In the current version of

Reference: Converters -- ICU User Guide.

See also: stri_enc_fromutf32, stri_enc_toascii, stri_enc_tonative,
stri_enc_toutf32, stri_enc_toutf8, stringi-encoding.