stri_enc_info: Query a Character Encoding

Description

Gets basic information on a character encoding.

Usage

stri_enc_info(enc = NULL)

Arguments

enc

NULL or "" for the default encoding, or a single string with an encoding name.

Value

Returns a list with the following components:

Name.friendly -- Friendly encoding name: MIME Name or JAVA Name or ICU Canonical Name (the first available one is selected; see below);
Name.ICU -- Encoding name as identified by ICU;
Name.* -- other standardized encoding names, e.g. Name.UTR22, Name.IBM, Name.WINDOWS, Name.JAVA, Name.IANA, Name.MIME (some of them may be unavailable for some encodings);
ASCII.subset -- is ASCII a subset of the given encoding?;
Unicode.1to1 -- for 8-bit encodings only: are all characters translated to exactly one Unicode code point and is the translation scheme reversible?;
CharSize.8bit -- is this an 8-bit encoding, i.e. do we have CharSize.min == CharSize.max and CharSize.min == 1?;
CharSize.min -- minimal number of bytes used to represent a UChar (a UTF-16 code unit, which is not the same as a UChar32 code point);
CharSize.max -- maximal number of bytes used to represent a UChar (a UTF-16 code unit, which is not the same as a UChar32 code point, i.e. this does not reflect the maximal code point representation size).
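
As a brief illustration of the components listed above, the sketch below queries one named encoding and the default encoding, then reads a few fields from the returned list. It assumes the stringi package is installed; the choice of "UTF-8" and the variable names info and default_info are only for illustration.

```r
library(stringi)

# Query a named encoding; the result is a named list
info <- stri_enc_info("UTF-8")

# Names under which the encoding is known
info$Name.friendly
info$Name.ICU

# Logical flags describing the encoding
info$ASCII.subset
info$CharSize.8bit

# Byte-size range per UChar, as reported by ICU
c(min = info$CharSize.min, max = info$CharSize.max)

# NULL (the default argument) selects the default encoding
default_info <- stri_enc_info()
names(default_info)
```

Since some Name.* entries may be missing for a given encoding, it is probably safest to check names(info) before relying on a particular one.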

An error is raised if the provided encoding is unknown to ICU (see stri_enc_list for more details).
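
Because an unknown encoding name raises an error, callers that accept encoding names from user input may want to guard the call. The following is a minimal sketch of one way to do so; safe_enc_info is a hypothetical helper (not part of stringi), and "no-such-encoding-xyz" is a made-up name assumed to be unknown to ICU.

```r
library(stringi)

# Hypothetical wrapper: returns NULL instead of stopping
# when ICU does not recognise the encoding name
safe_enc_info <- function(enc) {
  tryCatch(
    stri_enc_info(enc),
    error = function(e) NULL
  )
}

known   <- safe_enc_info("UTF-8")
unknown <- safe_enc_info("no-such-encoding-xyz")

is.null(known)    # a list was returned for the known encoding
is.null(unknown)  # the error was caught and replaced by NULL
```

Returning NULL is just one design choice; rethrowing with a clearer message, or validating the name against the output of stri_enc_list beforehand, would work as well.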