punycoder (version 1.1.0)

puny_encode: Encode Unicode domains to ASCII punycode

Description

Converts Unicode domain names to their ASCII Punycode representation as specified in RFC 3492. This is useful when processing internationalized domain names (IDNs), for example in web scraping and URL analysis.

Usage

puny_encode(x, strict = getOption("punycoder.strict", TRUE))

Value

A character vector the same length as x, with each element containing the ASCII punycode-encoded domain name. Elements corresponding to NA inputs are NA_character_. In non-strict mode, domains that fail encoding are also returned as NA_character_.
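The return contract described above can be checked directly. The following is a minimal sketch, not part of the package's own examples: it is guarded so that it only calls puny_encode when the punycoder package is installed, and the input vector x is made up for illustration.

```r
# Illustrative input: one Unicode domain and one missing value
x <- c("caf\u00E9.com", NA_character_)

if (requireNamespace("punycoder", quietly = TRUE)) {
  # Non-strict mode: per the Value section, failures come back as NA
  out <- punycoder::puny_encode(x, strict = FALSE)

  # The result is a character vector of the same length as the input,
  # and the NA input stays NA
  stopifnot(
    is.character(out),
    length(out) == length(x),
    is.na(out[2])
  )
}
```

Because the output is aligned with the input, the result can be stored as a new column next to the original domains without any re-indexing.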

Arguments

x

Character vector of Unicode domain names to encode

strict

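Logical; whether to apply strict validation. Defaults to `getOption("punycoder.strict", TRUE)`, that is, TRUE unless the punycoder.strict option has been set. When FALSE, domains that fail encoding are returned as NA_character_ (see Value).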
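Since the default for strict is looked up with base R's getOption(), it can be changed for a whole session through options(). The sketch below uses only base R and does not call puny_encode itself. It just shows how the lookup in the Usage signature resolves with and without the option set.

```r
# Make sure the option is unset, remembering any previous value
old <- options(punycoder.strict = NULL)

# With the option unset, getOption() falls back to the supplied default
default_when_unset <- getOption("punycoder.strict", TRUE)
print(default_when_unset)  # TRUE

# Setting the option changes what the default argument resolves to
options(punycoder.strict = FALSE)
default_when_set <- getOption("punycoder.strict", TRUE)
print(default_when_set)    # FALSE

# Restore the previous state
options(old)
```

Passing strict explicitly in a call, as in puny_encode(x, strict = TRUE), overrides whatever the option is set to, because the option is only consulted when the argument is missing.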

See Also

puny_decode for the reverse operation, url_encode for full URL encoding.

Examples

# \donttest{
# Basic encoding
puny_encode("caf\u00E9.com")
puny_encode("\u043C\u043E\u0441\u043A\u0432\u0430.\u0440\u0444")

# Vectorized encoding
domains <- c(
  "caf\u00E9.com",
  "\u043C\u043E\u0441\u043A\u0432\u0430.\u0440\u0444",
  "\u5317\u4EAC.\u4E2D\u56FD"
)
puny_encode(domains)
# }
