The Unicode standard does not formalize the notion of character
width. The rules below are roughly based on
https://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c,
https://github.com/nodejs/node/blob/master/src/node_i18n.cc,
and UAX #11.
The following code points are of width 0:
- code points with general category (see stringi-search-charclass) Me, Mn, and Cf;
- C0 and C1 control codes (general category Cc), for compatibility with the nchar function;
- Hangul Jamo medial vowels and final consonants (code points with enumerable property UCHAR_HANGUL_SYLLABLE_TYPE equal to U_HST_VOWEL_JAMO or U_HST_TRAILING_JAMO; note that applying the NFC normalization with stri_trans_nfc is encouraged);
- ZERO WIDTH SPACE (U+200B).
Characters with the UCHAR_EAST_ASIAN_WIDTH enumerable property
equal to U_EA_FULLWIDTH or U_EA_WIDE are of width 2.

Most emojis and characters with general category So (other symbols)
are of width 2.

SOFT HYPHEN (U+00AD), for compatibility with nchar,
and all remaining characters are of width 1.
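
As a rough illustration of the rules above, the following Python sketch classifies code points using only the standard unicodedata module. It is not the stringi implementation. The names char_width and string_width are hypothetical, Hangul Jamo medial vowels and final consonants are matched by assumed code point ranges rather than through UCHAR_HANGUL_SYLLABLE_TYPE, and every So character is treated as width 2 even though the text says only "most" are. Results can therefore differ from what stringi reports, and they also depend on the Unicode version bundled with the Python interpreter.

```python
import unicodedata


def is_hangul_medial_or_final(cp: int) -> bool:
    """Assumed stand-in for U_HST_VOWEL_JAMO / U_HST_TRAILING_JAMO.

    Uses fixed code point ranges from the Hangul Jamo and
    Hangul Jamo Extended-B blocks instead of a property lookup.
    """
    return (
        0x1160 <= cp <= 0x11FF      # medial vowels and final consonants
        or 0xD7B0 <= cp <= 0xD7C6   # extended medial vowels
        or 0xD7CB <= cp <= 0xD7FB   # extended final consonants
    )


def char_width(ch: str) -> int:
    """Approximate display width (0, 1, or 2) of a single code point."""
    cp = ord(ch)

    # SOFT HYPHEN has category Cf, so it must be handled before the
    # width-0 rules: the text assigns it width 1.
    if cp == 0x00AD:
        return 1

    # ZERO WIDTH SPACE is listed explicitly among the width-0 code points.
    if cp == 0x200B:
        return 0

    category = unicodedata.category(ch)

    # Enclosing/nonspacing marks, format characters, and control codes.
    if category in ("Me", "Mn", "Cf", "Cc"):
        return 0

    if is_hangul_medial_or_final(cp):
        return 0

    # East Asian Width property: F = fullwidth, W = wide.
    if unicodedata.east_asian_width(ch) in ("F", "W"):
        return 2

    # Simplification: the text says "most" So characters; this sketch
    # treats all of them as width 2.
    if category == "So":
        return 2

    return 1


def string_width(s: str) -> int:
    """Sum the approximate widths of all code points in s."""
    return sum(char_width(ch) for ch in s)
```

Because the sketch works code point by code point, a base letter followed by a combining mark counts once, and a string of CJK ideographs counts two columns per character. The order of the checks matters: SOFT HYPHEN is tested first so that its Cf category does not send it to width 0.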