romanization: Romanization of Japanese

Description

kana2roma romanizes Japanese characters in a string or character vector, transcribing their sounds for English-speaking readers. The kakasi function in the Nippon package also romanizes Japanese; kana2roma offers a more limited alternative. Unlike kakasi, kana2roma works without the help of an external library.

Usage

kana2roma(x, type = c("Hepburn", "Nippon.shiki", "Kunrei.shiki"), 
    cap = FALSE, ascii.only = TRUE)

Arguments

x

A character vector containing Japanese Hiragana or Katakana.

type

A character string specifying the type of romanization: "Hepburn", "Nippon.shiki" or "Kunrei.shiki". Default is "Hepburn".

cap

logical. If TRUE, the first letter of each Romaji string is capitalized. Default is FALSE.

ascii.only

logical. If TRUE, the result is transcribed with ASCII characters only. Default is TRUE.

Value

A character vector of romanized (Romaji) strings.

Details

Japanese strings are often made up of a mixture of Chinese characters (Kanji), Kana (Hiragana and Katakana) and Romaji (phonetic transcription in the Latin alphabet). kana2roma transcribes Kana to Romaji without the help of external programs such as kakasi. It is especially useful when users want to sanitize Japanese strings in a data set and make them readable for English-speaking readers.

The function supports the three main romanization systems. Although Kunrei-shiki (ISO 3602; its stricter form, Nihon-shiki, is ISO 3602 Strict) was adopted as the official system in Japan, Hepburn is the most widely used, especially for proper nouns, and is officially adopted in naming systems for railway stations and roads. A variant of Hepburn is authorized by the Japanese Foreign Ministry for use in passports.
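The three systems differ mainly in a handful of syllables. The base R sketch below illustrates this with a tiny lookup table. It is not the table or the code used inside kana2roma. The helper name romanize_sketch and the choice of six Hiragana are assumptions made for illustration only.

```r
# Illustrative lookup only; NOT the implementation of kana2roma.
# Kana are written as \u escapes so the script does not depend on file encoding.
kana <- c("\u3057", "\u3061", "\u3064", "\u3075", "\u3062", "\u3065")

# Romaji for the six kana above under each system.
systems <- list(
  Hepburn      = c("shi", "chi", "tsu", "fu", "ji", "zu"),
  Nippon.shiki = c("si",  "ti",  "tu",  "hu", "di", "du"),
  Kunrei.shiki = c("si",  "ti",  "tu",  "hu", "zi", "zu")
)

# Hypothetical helper: romanize single kana by table lookup.
romanize_sketch <- function(x, type = c("Hepburn", "Nippon.shiki", "Kunrei.shiki")) {
  type <- match.arg(type)
  out <- systems[[type]][match(x, kana)]
  # Kana outside this tiny table are returned unchanged.
  ifelse(is.na(out), x, out)
}

romanize_sketch(kana[1:3])                         # Hepburn spellings
romanize_sketch(kana[5:6], type = "Nippon.shiki")  # Nihon-shiki spellings
```

A real transcriber also has to handle contracted sounds, geminate consonants and long vowels, which a one-to-one table like this cannot express.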

For place names and other proper nouns, set cap = TRUE in kana2roma (the default is FALSE) to capitalize the first letter of each Romaji string.
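The effect of capitalizing the first letter can be sketched in base R as follows. The helper capitalize_first is hypothetical and is not part of the Nippon package; how kana2roma implements cap internally is not stated here.

```r
# Hypothetical helper, not part of Nippon: upper-case the first letter only.
capitalize_first <- function(x) {
  paste0(toupper(substring(x, 1, 1)), substring(x, 2))
}

capitalize_first(c("tokyo", "osaka"))
```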

Set ascii.only = TRUE in kana2roma (this is the default) to suppress non-ASCII Romaji. Otherwise, a strict romanization system may return values containing non-ASCII characters, namely vowels with macrons.
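As a rough illustration of what suppressing non-ASCII Romaji involves, the base R sketch below replaces macron vowels with plain vowels. The helper strip_macrons is hypothetical, and dropping the macron is only one possible convention; kana2roma may represent long vowels differently when ascii.only = TRUE.

```r
# Hypothetical helper, not part of Nippon: map macron vowels to plain ASCII vowels.
strip_macrons <- function(x) {
  chartr("\u0101\u012b\u016b\u0113\u014d\u0100\u012a\u016a\u0112\u014c",
         "aiueoAIUEO", x)
}

# "T\u014dky\u014d" is Tokyo written with macrons over both o's.
strip_macrons("T\u014dky\u014d")
```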

Examples

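# NOT RUN {
library(Nippon)
# Build a test vector from a few Hiragana and Katakana entries
jpn <- c(hiragana()[21:25], katakana()[26:30])
# Romanize with the defaults (Hepburn, no capitalization, ASCII only)
kana2roma(jpn)
# }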