There are several “helper” functions which can simplify the definition of complex patterns. First we define some functions that will help us display the patterns:
one.pattern <- function(pat){
  if(is.character(pat)){
    pat
  }else{
    nc::var_args_list(pat)[["pattern"]]
  }
}
show.patterns <- function(...){
  L <- list(...)
  str(lapply(L, one.pattern))
}
nc::field for reducing repetition
The nc::field function can be used to avoid repetition when defining patterns of the form variable: value. The example below shows three (mostly) equivalent ways to write a regex that captures the text after the colon and space; the captured text is stored in the variable group or output column:
show.patterns(
  "variable: (?<variable>.*)",      #repetitive regex string
  list("variable: ", variable=".*"),#repetitive nc R code
  nc::field("variable", ": ", ".*"))#helper function avoids repetition
#> List of 3
#> $ : chr "variable: (?<variable>.*)"
#> $ : chr "(?:variable: (.*))"
#> $ : chr "(?:variable: (?:(.*)))"
Note that the first version above has a named capture group, whereas the second and third patterns generated by nc have an un-named capture group and some non-capturing groups (but they all match the same pattern).
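The claim that all three patterns match the same text can be checked with base R alone. The sketch below is illustrative rather than part of the package: the subject string is made up, and the three pattern strings are copied from the output above.

```r
# Hypothetical subject string, chosen only for illustration.
subject <- "variable: some value"

# The three pattern strings shown in the output above.
patterns <- c(
  named    = "variable: (?<variable>.*)",
  nc.list  = "(?:variable: (.*))",
  nc.field = "(?:variable: (?:(.*)))")

# For each pattern, extract the text of the first capture group.
captured <- sapply(patterns, function(p) {
  m <- regmatches(subject, regexec(p, subject, perl = TRUE))[[1]]
  m[[2]]  # element 1 is the whole match, element 2 is the first capture group
})
print(captured)
```

Each pattern has exactly one capture group, so the second element of the match vector is the captured value in all three cases.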
Another example:
show.patterns(
  "Alignment (?<Alignment>[0-9]+)",
  list("Alignment ", Alignment="[0-9]+"),
  nc::field("Alignment", " ", "[0-9]+"))
#> List of 3
#> $ : chr "Alignment (?<Alignment>[0-9]+)"
#> $ : chr "(?:Alignment ([0-9]+))"
#> $ : chr "(?:Alignment (?:([0-9]+)))"
Another example:
show.patterns(
  "Chromosome:\t+(?<Chromosome>.*)",
  list("Chromosome:\t+", Chromosome=".*"),
  nc::field("Chromosome", ":\t+", ".*"))
#> List of 3
#> $ : chr "Chromosome:\t+(?<Chromosome>.*)"
#> $ : chr "(?:Chromosome:\t+(.*))"
#> $ : chr "(?:Chromosome:\t+(?:(.*)))"
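To see nc::field used in an actual match rather than only displayed, the following minimal sketch passes it to nc::capture_first_vec. The subject strings are invented for illustration, and the sketch assumes the nc package is installed.

```r
# Hypothetical input lines, invented for this example.
subject <- c("Alignment 42", "Alignment 7")

# nc::field("Alignment", " ", "[0-9]+") matches the literal field name,
# then the separator, then captures the digits in a group named Alignment.
match.dt <- nc::capture_first_vec(
  subject,
  nc::field("Alignment", " ", "[0-9]+"))

# The result has one row per subject and one column per named group.
print(match.dt)
```

The field name is typed once but serves both as literal text in the regex and as the name of the output column, which is the repetition the helper is meant to remove.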
nc::quantifier for fewer parentheses
Another helper function is nc::quantifier, which makes patterns easier to read by reducing the number of parentheses required to define sub-patterns with quantifiers. For example, all three patterns below create an optional non-capturing group which contains a named capture group:
show.patterns(
  "(?:-(?<chromEnd>[0-9]+))?",               #regex string
  list(list("-", chromEnd="[0-9]+"), "?"),   #nc pattern using lists
  nc::quantifier("-", chromEnd="[0-9]+", "?"))#quantifier helper function
#> List of 3
#> $ : chr "(?:-(?<chromEnd>[0-9]+))?"
#> $ : chr "(?:(?:-([0-9]+))?)"
#> $ : chr "(?:(?:-([0-9]+))?)"
Another example with a named capture group inside an optional non-capturing group:
show.patterns(
  "(?: (?<name>[^,}]+))?",
  list(list(" ", name="[^,}]+"), "?"),
  nc::quantifier(" ", name="[^,}]+", "?"))
#> List of 3
#> $ : chr "(?: (?<name>[^,}]+))?"
#> $ : chr "(?:(?: ([^,}]+))?)"
#> $ : chr "(?:(?: ([^,}]+))?)"
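As a rough illustration of how the optional group behaves during matching, the sketch below embeds nc::quantifier in a larger pattern. The subject strings and the chrom and chromStart group names are assumptions made up for this example; only the chromEnd sub-pattern comes from the text above. It assumes the nc package is installed with its default PCRE engine.

```r
# Hypothetical genomic range strings; the second has no end position.
subject <- c("chr1:100-200", "chr2:300")

range.dt <- nc::capture_first_vec(
  subject,
  chrom = "chr.*?",       # assumed group name, lazy match up to the colon
  ":",
  chromStart = "[0-9]+",  # assumed group name
  # Optional "-<digits>" suffix, written without nested lists.
  nc::quantifier("-", chromEnd = "[0-9]+", "?"))

print(range.dt)
```

For the second subject the optional group does not participate in the match, so the chromEnd entry is expected to be an empty string rather than an error.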
nc::alternatives for simplified alternation
We also provide a helper function for defining regex patterns with alternation. The following three lines are equivalent.
show.patterns(
  "(?:(?<first>bar+)|(?<second>fo+))",
  list(first="bar+", "|", second="fo+"),
  nc::alternatives(first="bar+", second="fo+"))
#> List of 3
#> $ : chr "(?:(?<first>bar+)|(?<second>fo+))"
#> $ : chr "(?:(bar+)|(fo+))"
#> $ : chr "(?:(bar+)|(fo+))"
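A short sketch of nc::alternatives in a real match follows. The subject strings are invented for illustration, and the sketch assumes the nc package is installed with its default PCRE engine.

```r
# Hypothetical subjects: one matches the second alternative, one the first.
subject <- c("foooo", "barrr")

alt.dt <- nc::capture_first_vec(
  subject,
  nc::alternatives(first = "bar+", second = "fo+"))

# One column per named alternative; only the alternative that matched
# is expected to hold text for a given row.
print(alt.dt)
```

Because each alternative is its own named group, the column that holds text in a given row also indicates which branch of the alternation matched.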