stringi (version 1.8.3)

stri_enc_isutf16be: Check If a Data Stream Is Possibly in UTF-16 or UTF-32

Description

These functions detect whether a given byte stream is valid UTF-16LE, UTF-16BE, UTF-32LE, or UTF-32BE.

Usage

stri_enc_isutf16be(str)

stri_enc_isutf16le(str)

stri_enc_isutf32be(str)

stri_enc_isutf32le(str)

Value

Returns a logical vector.

Arguments

str

character vector, a raw vector, or a list of raw vectors

Author

Marek Gagolewski and other contributors

Details

These functions are independent of the way R marks encodings in character strings (see Encoding and stringi-encoding). Most often, these functions act on raw vectors.

A result of FALSE means that a string is surely not valid UTF-16 or UTF-32. However, false positives are possible.

Also note that a data stream may be sometimes classified as both valid UTF-16LE and UTF-16BE.

See Also

The official online manual of stringi at https://stringi.gagolewski.com/

Gagolewski M., stringi: Fast and portable character string processing in R, Journal of Statistical Software 103(2), 2022, 1-59, tools:::Rd_expr_doi("10.18637/jss.v103.i02")

Other encoding_detection: about_encoding, stri_enc_detect2(), stri_enc_detect(), stri_enc_isascii(), stri_enc_isutf8()