punycoder (version 1.1.0)

parse_url: Parse URLs with internationalized domain name handling

Description

Parses URLs into a structured list of components, with handling for internationalized domain names (IDNs). Domain components are available in Unicode or, via encode_domains, in ASCII form.

Usage

parse_url(url, encode_domains = FALSE)

Value

An object of class "punycoder_parsed_url" (a named list) with components:

scheme

Character vector of URL schemes (e.g., "https").

domain

Character vector of domain names.

port

Integer vector of port numbers.

path

Character vector of URL paths.

query

Character vector of query strings.

fragment

Character vector of fragment identifiers.

Each component has one element per input URL. Invalid URLs yield NA components. For valid URLs without an explicit path, path is returned as "".
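
As a rough sketch of working with the return value, the per-URL components can be lined up in a data frame using base R only. This assumes the result behaves like an ordinary named list of equal-length vectors, as described above. The object below is a hand-built, hypothetical stand-in with the documented structure, not actual output of parse_url().

```r
# Hypothetical stand-in with the documented structure
# (hand-built for illustration; NOT real parse_url() output)
parsed <- structure(
  list(
    scheme   = c("https", NA),
    domain   = c("example.com", NA),
    port     = c(8080L, NA),
    path     = c("", NA),
    query    = c("q=1", NA),
    fragment = c("top", NA)
  ),
  class = "punycoder_parsed_url"
)

# Drop the class so base R treats it as a plain list, then tabulate:
# one row per input URL, one column per component
parsed_df <- as.data.frame(unclass(parsed), stringsAsFactors = FALSE)

# Rows in which every component is NA correspond to URLs that failed to parse
invalid <- rowSums(!is.na(parsed_df)) == 0
parsed_df[!invalid, ]
```

Checking for all-NA rows, rather than a single NA component, avoids discarding valid URLs that merely lack an optional part.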

Arguments

url

Character vector of URLs to parse.

encode_domains

Logical flag; if TRUE, parsed host names are encoded to ASCII (Punycode). Defaults to FALSE.

See Also

url_encode, url_decode for URL transformation with IDN handling.

Examples

# \donttest{
# Parse URL with Unicode domain
parse_url(
  "https://caf\u00E9.example.com:8080/path?query=value#fragment"
)

# Parse multiple URLs
urls <- c(
  "https://caf\u00E9.com/menu",
  "https://\u043C\u043E\u0441\u043A\u0432\u0430.\u0440\u0444/info"
)
parse_url(urls)
# }
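
A further sketch, using only the arguments and components documented on this page: encode_domains = TRUE requests ASCII host names, and individual components can be extracted with $. It assumes the punycoder package is installed. Outputs are deliberately not shown.

```r
library(punycoder)

# Unicode URL from the first example, shortened
url <- "https://caf\u00E9.example.com:8080/path"

# Request ASCII host names
ascii <- parse_url(url, encode_domains = TRUE)
ascii$domain

# Default behaviour (encode_domains = FALSE) for comparison
unicode <- parse_url(url)
unicode$domain
unicode$port
```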