zen2han: Convert Japanese characters from fullwidth (zenkaku) to halfwidth
(hankaku) forms
Description
Converts Japanese characters from fullwidth (zenkaku) to halfwidth
(hankaku) forms to avoid problems in Japanese string operations.
Usage
zen2han(x)
Arguments
x
A character vector.
Value
A character vector in which all letters, numbers, and symbols are in their halfwidth form.
Details
Japanese graphic characters are traditionally classified into fullwidth
(zenkaku) and halfwidth (hankaku) forms. Letters, numbers, and symbols can
take either form, while Hiragana and Kanji are available only as fullwidth
characters (Katakana also has a halfwidth form, which this function does not
produce). Mixing the two forms of letters, numbers, and symbols causes
trouble in string manipulation such as matching or searching, because the
fullwidth and halfwidth versions of the same character do not compare as
equal. Character data should therefore be sanitized with this function
before such operations.
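To illustrate the mismatch described above, the following sketch uses only base R. The strings are made-up sample data, and the fullwidth characters are written as Unicode escapes so that the code does not depend on the encoding of the source file.

```r
# Made-up sample data: the same label typed once in fullwidth, once in halfwidth.
full <- "\uFF21\uFF22\uFF23\uFF11\uFF12\uFF13"  # fullwidth "ABC123"
half <- "ABC123"

# The two strings look alike but consist of different code points.
identical(full, half)             # FALSE
grepl("ABC", full, fixed = TRUE)  # FALSE: the halfwidth pattern is not found
utf8ToInt(substr(full, 1, 1))     # 65313 (U+FF21, fullwidth A)
utf8ToInt(substr(half, 1, 1))     # 65    (U+0041, halfwidth A)
```

Passing such data through zen2han before comparing or searching is intended to remove this kind of mismatch.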
The targeted zenkaku characters are listed in the zenkaku constant built
into the Nippon package: only letters and numbers. Katakana is not
a target of zen2han because halfwidth Katakana tends to cause
problems of its own.
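The idea of converting only letters and numbers can be sketched in base R as follows. This is not the Nippon implementation: zen2han_sketch is a hypothetical name, and the sketch assumes that the targets are the Unicode fullwidth digits (U+FF10 to U+FF19) and fullwidth Latin letters (U+FF21 to U+FF3A and U+FF41 to U+FF5A), each of which lies exactly 0xFEE0 above its halfwidth counterpart. Everything else, including Katakana, is left unchanged.

```r
# Hypothetical helper for illustration only; not the code used by zen2han.
zen2han_sketch <- function(x) {
  x <- enc2utf8(as.character(x))
  vapply(x, function(s) {
    if (is.na(s)) return(NA_character_)
    cp <- utf8ToInt(s)  # code points of one string
    # Assumed target ranges: fullwidth digits, uppercase and lowercase letters.
    is_target <- (cp >= 0xFF10L & cp <= 0xFF19L) |
                 (cp >= 0xFF21L & cp <= 0xFF3AL) |
                 (cp >= 0xFF41L & cp <= 0xFF5AL)
    # Shift targeted code points down to their halfwidth counterparts.
    cp[is_target] <- cp[is_target] - 0xFEE0L
    intToUtf8(cp)
  }, character(1), USE.NAMES = FALSE)
}

# Fullwidth "A", "b", "3" followed by Katakana "ka" (U+30AB), which is kept as is.
zen2han_sketch("\uFF21\uFF42\uFF13\u30AB")
```

A code point shift is used here only because it keeps the sketch short; a lookup table built from a list of target characters would work equally well.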
References
Halfwidth and Fullwidth Forms http://www.alanwood.net/unicode/halfwidth_and_fullwidth_forms.html