The punycoder package provides high-performance Unicode
and Punycode encoding/decoding for internationalized domain names
(IDNs). It addresses critical gaps in R’s URL processing capabilities by
offering reliable, fast conversion between Unicode and ASCII
representations of domain names.
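To make the Unicode/ASCII distinction concrete, here is a minimal base-R sketch that checks whether a name contains non-ASCII code points and whether any of its labels carries the `xn--` ASCII-compatible-encoding prefix. The helper names `has_non_ascii` and `has_ace_prefix` are hypothetical and are not part of punycoder; the package's own checks may apply stricter rules.

```r
# Hypothetical helpers for illustration; these are not punycoder functions.

# TRUE when a string contains at least one code point outside ASCII (> 127).
has_non_ascii <- function(x) {
  vapply(
    x,
    function(s) any(utf8ToInt(enc2utf8(s)) > 127L),
    logical(1),
    USE.NAMES = FALSE
  )
}

# TRUE when any dot-separated label starts with the ACE prefix "xn--".
has_ace_prefix <- function(x) {
  grepl("(^|\\.)xn--", x, ignore.case = TRUE)
}

domains <- c("caf\u00e9.com", "xn--caf-dma.com", "example.com")
has_non_ascii(domains)   # TRUE FALSE FALSE
has_ace_prefix(domains)  # FALSE TRUE FALSE
```

A prefix test like this only says that a label looks encoded; it does not confirm that the label decodes to valid Unicode.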
International domain names containing Unicode characters (like café.com or москва.рф) need to be converted to ASCII format for use in many network protocols and systems. Existing R packages have limitations:
punycoder provides:
library(punycoder)
# Encode Unicode domains to ASCII
puny_encode("café.com")
# Returns: "xn--caf-dma.com"
puny_encode("москва.рф")
# Returns: "xn--80adxhks.xn--p1ai"
# Decode ASCII domains back to Unicode
puny_decode("xn--caf-dma.com")
# Returns: "café.com"
# Vectorized operations
domains <- c("café.com", "москва.рф", "北京.中国")
encoded <- puny_encode(domains)
print(encoded)

# Check if domain is already punycode
is_punycode("xn--caf-dma.com") # TRUE
is_punycode("café.com") # FALSE
# Check if domain contains Unicode characters
is_idn("café.com") # TRUE
is_idn("example.com") # FALSE
# Comprehensive domain validation
result <- validate_domain(c("café.com", "invalid..domain", "valid.org"))
print(result)

# Example: Processing international URLs for web scraping
international_urls <- c(
  "https://café.paris.fr/menu",
  "https://москва.рф/news",
  "https://北京.中国/info"
)
# Convert to ASCII for HTTP requests
ascii_urls <- url_encode(international_urls)
print(ascii_urls)
# Process the data...
# Convert back to Unicode for display
display_urls <- url_decode(ascii_urls)
print(display_urls)

The package provides robust error handling with informative messages:
# Strict validation (default)
try({
  puny_encode(c("valid.com", "")) # Empty string causes error
})
# Non-strict mode returns NA for invalid input
result <- puny_encode(c("valid.com", ""), strict = FALSE)
print(result)
# Validation provides detailed error information
validation <- validate_domain(c("valid.com", "invalid..domain", ""))
print(validation)

The package is designed for high-performance processing:
You can configure package behavior using R options:
punycoder is designed to integrate well with other R
packages:
# With data.table
library(data.table)
dt <- data.table(
  original = c("café.com", "москва.рф"),
  encoded = puny_encode(c("café.com", "москва.рф"))
)
# With dplyr
library(dplyr)
urls_df <- data.frame(
  unicode_url = c("https://café.com", "https://москва.рф")
) |>
  mutate(
    ascii_url = url_encode(unicode_url),
    is_international = is_idn(unicode_url)
  )

help(package = "punycoder")

The package uses a C++ backend with Rcpp for performance and follows
RFC 3492 for its Punycode implementation. When
libidn2 is available at build time, punycoder
uses it behind the same R-level API; otherwise it falls back to the built-in
implementation.
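For readers curious about what RFC 3492 actually specifies, the following is a minimal base-R sketch of the encoding step for a single label, written directly from the RFC's pseudocode. It is not punycoder's C++ implementation: the function names `punycode_label_sketch` and `punycode_domain_sketch` are hypothetical, and the sketch skips the case folding, normalization, length limits, and overflow checks that a production encoder needs.

```r
# Sketch of RFC 3492 encoding in base R. Hypothetical names; not punycoder's
# internals. No normalization, length limits, or overflow checks.
punycode_label_sketch <- function(label) {
  base <- 36L; tmin <- 1L; tmax <- 26L; skew <- 38L; damp <- 700L
  digits <- c(letters, 0:9)  # digit values 0..25 -> a..z, 26..35 -> 0..9

  # Bias adaptation function from RFC 3492, section 6.1.
  adapt <- function(delta, numpoints, first) {
    delta <- if (first) delta %/% damp else delta %/% 2L
    delta <- delta + delta %/% numpoints
    k <- 0L
    while (delta > ((base - tmin) * tmax) %/% 2L) {
      delta <- delta %/% (base - tmin)
      k <- k + base
    }
    k + ((base - tmin + 1L) * delta) %/% (delta + skew)
  }

  cp <- utf8ToInt(enc2utf8(label))
  basic <- cp[cp < 128L]
  if (length(basic) == length(cp)) return(label)  # pure ASCII: unchanged

  # Copy the basic (ASCII) code points first, then the delimiter if any exist.
  out <- intToUtf8(basic, multiple = TRUE)
  if (length(basic) > 0L) out <- c(out, "-")

  n <- 128L; delta <- 0L; bias <- 72L
  b <- length(basic); h <- b
  while (h < length(cp)) {
    m <- min(cp[cp >= n])              # next code point to insert
    delta <- delta + (m - n) * (h + 1L)
    n <- m
    for (p in cp) {
      if (p < n) delta <- delta + 1L
      if (p == n) {
        # Emit delta as a generalized variable-length integer.
        q <- delta
        k <- base
        repeat {
          t <- if (k <= bias) tmin else if (k >= bias + tmax) tmax else k - bias
          if (q < t) break
          out <- c(out, digits[t + (q - t) %% (base - t) + 1L])
          q <- (q - t) %/% (base - t)
          k <- k + base
        }
        out <- c(out, digits[q + 1L])
        bias <- adapt(delta, h + 1L, h == b)
        delta <- 0L
        h <- h + 1L
      }
    }
    delta <- delta + 1L
    n <- n + 1L
  }
  paste0("xn--", paste(out, collapse = ""))
}

# Encode each dot-separated label of a single domain name.
punycode_domain_sketch <- function(domain) {
  labels <- strsplit(domain, ".", fixed = TRUE)[[1]]
  paste(vapply(labels, punycode_label_sketch, character(1)), collapse = ".")
}

punycode_domain_sketch("caf\u00e9.com")  # "xn--caf-dma.com"
```

The sketch handles one domain at a time and grows its output vector inside a loop, which is fine for illustration but is exactly the kind of overhead a compiled, vectorized backend avoids.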