This vignette picks up where the “YAML in 2 Minutes” intro leaves off. It shows what YAML tags are and how to work with them in yaml12 with tag handlers. Along the way we also cover complex-valued keys and node anchors, so you can work with any advanced YAML (version 1.2) you might see in the wild.
parse_yaml() and read_yaml() accept
handlers: a named list of functions whose names are YAML
tag strings. Handlers run on any matching tagged node. For tagged
scalars, the handler always receives a length-1 string; for tagged
sequences or mappings, it receives an R vector representing that
node.
Here is an example of using a handler to evaluate !expr
nodes.
handlers <- list(
"!expr" = function(x) eval(str2lang(x), globalenv())
)
parse_yaml("!expr 1+1", handlers = handlers)
#> [1] 2

Any errors from a handler stop parsing:
parse_yaml("!expr stop('boom')", handlers = handlers)
#> Error in eval(str2lang(x), globalenv()): boom

Any tag without a matching handler is preserved as a
yaml_tag attribute, and handlers without matching tags are
left unused.
handlers <- list(
"!expr" = function(x) eval(str2lang(x), globalenv()),
"!upper" = toupper,
"!lower" = tolower # unused
)
str(parse_yaml(handlers = handlers, "
- !expr 1+1
- !upper r is awesome
- !note this tag has no handler
"))
#> List of 3
#> $ : num 2
#> $ : chr "R IS AWESOME"
#> $ : chr "this tag has no handler"
#> ..- attr(*, "yaml_tag")= chr "!note"

With a tagged sequence, the handler is called with an unnamed R list,
or an atomic vector if simplify = TRUE and all the sequence
elements are a common type. With tagged mappings the handler is called
with a named R list, potentially with a yaml_keys attribute
(more on this in the next section).
handlers <- list(
"!some_seq_tag" = function(x) {
stopifnot(identical(x, c("a", "b")))
"some handled value"
},
"!some_map_tag" = function(x) {
stopifnot(identical(x, list(key1 = 1L, key2 = 2L)))
"some other handled value"
}
)
yaml_tagged_containers <- "
- !some_seq_tag [a, b]
- !some_map_tag {key1: 1, key2: 2}
"
str(parse_yaml(yaml_tagged_containers, handlers = handlers))
#> List of 2
#> $ : chr "some handled value"
#> $ : chr "some other handled value"

Handlers make it easy to opt into powerful behaviors (like evaluating
!expr nodes) while keeping the default parser strict and
safe.
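Because a handler is an ordinary R function that receives a length-1 string for a tagged scalar, you choose the environment in which it evaluates. As a minimal sketch, the hypothetical safe_expr() below (not part of yaml12) evaluates the text of an !expr node in an environment that exposes only a few arithmetic operators, instead of globalenv(). It narrows what an expression can reach, but it should not be read as a complete sandbox.

```r
# Hypothetical helper, not part of yaml12: an environment with no parent
# that contains only a handful of arithmetic operators.
safe_env <- new.env(parent = emptyenv())
for (op in c("+", "-", "*", "/", "^", "(")) {
  assign(op, get(op, envir = baseenv()), envir = safe_env)
}

# A handler for tagged scalars: takes a length-1 string, returns its value.
safe_expr <- function(x) {
  stopifnot(is.character(x), length(x) == 1L)
  eval(str2lang(x), safe_env)
}

safe_expr("1 + 2 * 3")
#> [1] 7

# Anything outside the allowed operators is not found, so this errors
# instead of running arbitrary code.
res <- try(safe_expr("stop('boom')"), silent = TRUE)
inherits(res, "try-error")
#> [1] TRUE
```

Such a function could then be supplied as handlers = list("!expr" = safe_expr).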
yaml_keys

In YAML, mapping keys do not have to be plain scalar strings. Any YAML node can be a key, including other scalar types, sequences, and even other mappings. For example, this is valid YAML even though the key is a boolean:
When yaml12 sees a mapping key that is not an untagged string
scalar, it keeps the original keys in a yaml_keys attribute
next to the values:
For complex key values, YAML uses the explicit mapping-key indicator
?. A line starting with ? introduces the key node (of any
type) of a mapping, and the following line that starts with
: holds its value:
In yaml12 you can see those keys via the yaml_keys
attribute:
yaml <- "
true: true
? [a, b]
: tuple
? {x: 1, y: 2}
: map-key
"
str(parse_yaml(yaml))
#> List of 3
#> $ : logi TRUE
#> $ : chr "tuple"
#> $ : chr "map-key"
#> - attr(*, "yaml_keys")=List of 3
#> ..$ : logi TRUE
#> ..$ : chr [1:2] "a" "b"
#> ..$ :List of 2
#> .. ..$ x: int 1
#> .. ..$ y: int 2

If you supply handlers, they run on keys as well, so a handler can
turn tagged keys into friendly R names before yaml_keys
needs to be attached. If all the mapping keys resolve to bare scalar
strings, then a yaml_keys attribute is not attached.
handlers <- list(
"!upper" = toupper,
"!airport" = function(x) paste0("IATA:", toupper(x))
)
yaml_tagged_key <- "
!upper newyork: !airport jfk
!upper warsaw: !airport waw
"
str(parse_yaml(yaml_tagged_key, handlers = handlers))
#> List of 2
#> $ NEWYORK: chr "IATA:JFK"
#> $ WARSAW : chr "IATA:WAW"

If you anticipate tagged mapping keys that you want to process
yourself, you’ll need a bit more bookkeeping. The yaml_keys
attribute is materialized whenever any key is not a plain, untagged
string scalar; you’ll want to walk those keys alongside the values and
optionally collapse yaml_keys back to NULL if
all keys become plain strings after handling tagged nodes. For example,
here is the earlier eval_yaml_expr_nodes expanded to also
handle tagged mapping keys. (This expanded postprocessor is equivalent
to passing
handlers = list("!expr" = \(x) eval(str2lang(x), globalenv())).)
is_bare_string <- \(x) {
  is.character(x) && length(x) == 1L && is.null(attributes(x))
}

eval_yaml_expr_nodes <- function(x) {
  if (is.list(x)) {
    # `x[] <-` keeps attributes such as yaml_keys; the result of a plain
    # lapply() would carry over names only
    x[] <- lapply(x, eval_yaml_expr_nodes)
    if (!is.null(keys <- attr(x, "yaml_keys", TRUE))) {
      keys <- lapply(keys, eval_yaml_expr_nodes)
      nms <- names(x)
      if (is.null(nms)) nms <- character(length(x))
      # promote keys that are now bare strings to ordinary names
      names(x) <- mapply(
        \(name, key) if (name == "" && is_bare_string(key)) key else name,
        nms,
        keys,
        USE.NAMES = FALSE
      )
      # drop yaml_keys once every key is a plain string
      attr(x, "yaml_keys") <-
        if (all(sapply(keys, is_bare_string))) NULL else keys
    }
  }
  if (identical(attr(x, "yaml_tag", TRUE), "!expr")) {
    x <- eval(str2lang(x), globalenv())
  }
  x
}

Because you control the traversal, you can add extra checks (for example, only allowing expressions under certain mapping keys).
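Since non-string keys live in the yaml_keys attribute rather than in names(), looking up a value by such a key takes a small helper. The sketch below uses a hypothetical get_by_key() (not part of yaml12) and, so that it runs without the package, a hand-built object that mirrors the structure printed by str() earlier in this section. It assumes that keys compare with identical(), so the type of the lookup key matters (1L versus 1).

```r
# Hand-built stand-in for a parsed mapping with non-string keys,
# mirroring the str() output shown earlier (an assumption for illustration).
parsed <- structure(
  list(TRUE, "tuple", "map-key"),
  yaml_keys = list(TRUE, c("a", "b"), list(x = 1L, y = 2L))
)

# Hypothetical helper: return the value whose key is identical() to `key`,
# or NULL if there is no such key.
get_by_key <- function(x, key) {
  keys <- attr(x, "yaml_keys", exact = TRUE)
  # Without yaml_keys, all keys are plain strings stored in names(x).
  if (is.null(keys)) keys <- as.list(names(x))
  hit <- which(vapply(keys, identical, logical(1), key))
  if (length(hit) == 0L) NULL else x[[hit[1L]]]
}

get_by_key(parsed, c("a", "b"))
#> [1] "tuple"
get_by_key(parsed, list(x = 1L, y = 2L))
#> [1] "map-key"
get_by_key(parsed, "missing")
#> NULL
```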
Most YAML files contain a single YAML document. YAML also
supports document streams, where a file or string holds
multiple YAML documents. Documents are separated by a start marker
(---) and may optionally include an end marker
(...).
For the reading functions (read_yaml(),
parse_yaml()), the multi argument defaults to
FALSE. In this mode, only the first YAML document is read.
If an end marker (...) or a new start marker
(---) is encountered, the parser stops and returns only the
first document. When multi = TRUE, all documents in the
stream are returned.
For the writing functions (write_yaml(),
format_yaml()), multi also defaults to
FALSE, producing a single YAML document. When
multi = TRUE, the provided R object is treated as a list of
documents and written as a YAML document stream, with documents
separated by the start marker ---. Regardless of
multi, write_yaml() always includes an initial
start marker and a final end marker.
write_yaml(list("foo", "bar"))
#> ---
#> - foo
#> - bar
#> ...
write_yaml(list("foo", "bar"), multi = TRUE)
#> ---
#> foo
#> ---
#> bar
#> ...

When multi = FALSE, parsing stops after the first
document, even if later content is not valid YAML. That makes it easy to
extract front matter from files that mix YAML with other text (like R
Markdown):
rmd_lines <- c(
"---",
"title: Front matter only",
"params:",
" answer: 42",
"---",
"# Body that is not YAML"
)
parse_yaml(rmd_lines)
#> $title
#> [1] "Front matter only"
#>
#> $params
#> $params$answer
#> [1] 42

Here the parser returns just the YAML front matter because the second
--- technically ends the first YAML document in a
YAML document stream; with multi = FALSE the
parser stops there and returns just the first YAML document.
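The parser returns only the front matter, so if you also need the remaining text you have to split the lines yourself. The following is a rough base R sketch with a hypothetical split_front_matter() helper (not part of yaml12). It assumes the front matter begins on the first line with --- and ends at the next line consisting solely of --- or ...

```r
# Hypothetical helper: separate YAML front matter from the rest of a file.
split_front_matter <- function(lines) {
  no_front_matter <- list(front_matter = character(), body = lines)
  if (length(lines) == 0L || !grepl("^---\\s*$", lines[1L])) {
    return(no_front_matter)
  }
  # Look for the closing marker among the lines after the opening one.
  closers <- grep("^(---|\\.\\.\\.)\\s*$", lines[-1L])
  if (length(closers) == 0L) {
    return(no_front_matter)
  }
  end <- closers[1L] + 1L  # index of the closing marker in `lines`
  list(
    front_matter = lines[seq_len(end - 2L) + 1L],  # between the two markers
    body = lines[-seq_len(end)]                    # everything after
  )
}

rmd_lines <- c(
  "---",
  "title: Front matter only",
  "params:",
  "  answer: 42",
  "---",
  "# Body that is not YAML"
)
parts <- split_front_matter(rmd_lines)
parts$body
#> [1] "# Body that is not YAML"
```

The front_matter element holds only the YAML lines and could be passed to parse_yaml(), while body holds the remaining document text.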
Anchors (&id) name a node; aliases
(*id) copy it. yaml12 resolves aliases before returning R
objects.
str(parse_yaml("
recycle-me: &anchor-name
a: b
c: d
recycled:
- *anchor-name
- *anchor-name
"))
#> List of 2
#> $ recycle-me:List of 2
#> ..$ a: chr "b"
#> ..$ c: chr "d"
#> $ recycled :List of 2
#> ..$ :List of 2
#> .. ..$ a: chr "b"
#> .. ..$ c: chr "d"
#> ..$ :List of 2
#> .. ..$ a: chr "b"
#> .. ..$ c: chr "d"

If you want to inspect how YAML nodes are parsed directly, you can
reach for the internal helper yaml12:::dbg_yaml() to print
the raw (Rust) saphyr::Yaml structures without converting
to R objects.