stringi (version 1.1.6)

stri_enc_toutf32: Convert Strings To UTF-32

Description

UTF-32 is a 32-bit encoding in which each Unicode code point corresponds to exactly one integer value. This function converts a character vector to a list of integer vectors, so that individual code points can easily be accessed, changed, etc.

Usage

stri_enc_toutf32(str)

Arguments

str

a character vector (or an object coercible to such a vector) to be converted

Value

Returns a list of integer vectors, one per element of str. Missing values are converted to NULL.
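
A minimal sketch of the return value, assuming the stringi package is installed. The input vector x and the variable name codes are illustrative only and are not part of the package.

```r
library(stringi)

# Illustrative input: an ASCII string, a missing value, and a string
# containing U+0105 (LATIN SMALL LETTER A WITH OGONEK).
x <- c("abc", NA, "a\u0105b")
codes <- stri_enc_toutf32(x)

length(codes)        # one list element per input string
codes[[1]]           # code points of "abc": 97 98 99
is.null(codes[[2]])  # the missing value is represented by NULL
codes[[3]]           # 97 261 98 (U+0105 is 261 in decimal)
```

Because each string becomes a plain integer vector, ordinary indexing such as codes[[3]][2] picks out a single code point.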

Details

See stri_enc_fromutf32 for the inverse operation.

This function is roughly equivalent to a vectorized call to utf8ToInt(enc2utf8(str)). If you want a list of raw vectors on output, use stri_encode.

Unlike utf8ToInt, if improper UTF-8 byte sequences are detected, the corresponding element is set to NULL and a warning is given; see also stri_enc_toutf8 for a way to deal with such cases.
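
The points above can be sketched as follows, assuming the stringi package is installed. The variable names (s, same, back, bad, res) and the hand-built invalid byte sequence are illustrative assumptions, and the exact warning text may differ between stringi versions.

```r
library(stringi)

s <- "a\u0105b"

# For a single valid string, the result should match base R's utf8ToInt().
same <- identical(stri_enc_toutf32(s)[[1]], utf8ToInt(enc2utf8(s)))

# Round trip through stri_enc_fromutf32(), the inverse operation.
back <- stri_enc_fromutf32(stri_enc_toutf32(s))

# A deliberately malformed string: byte 0xff never occurs in valid UTF-8.
bad <- rawToChar(as.raw(c(0x61, 0xff)))
Encoding(bad) <- "UTF-8"

# The warning is suppressed here only to keep the example quiet;
# per the description above, the element is expected to be NULL.
res <- suppressWarnings(stri_enc_toutf32(bad))
is.null(res[[1]])
```

In real code it is usually better to let the warning surface, or to repair the input with stri_enc_toutf8 first, than to suppress it as this sketch does.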

See Also

Other encoding_conversion: stri_enc_fromutf32, stri_enc_toascii, stri_enc_tonative, stri_enc_toutf8, stri_encode, stringi-encoding